ETAPS Foreword


Welcome to the proceedings of ETAPS 2018! After a somewhat coldish ETAPS 2017 
in Uppsala in the north, ETAPS this year took place in Thessaloniki, Greece. I am 
happy to announce that this is the first ETAPS with gold open access proceedings. This 
means that all papers are accessible by anyone for free. 

ETAPS 2018 was the 21st instance of the European Joint Conferences on Theory 
and Practice of Software. ETAPS is an annual federated conference established in 
1998, and consists of five conferences: ESOP, FASE, FoSSaCS, TACAS, and POST. 
Each conference has its own Program Committee (PC) and its own Steering Com- 
mittee. The conferences cover various aspects of software systems, ranging from 
theoretical computer science to foundations to programming language developments, 
analysis tools, formal approaches to software engineering, and security. Organizing 
these conferences in a coherent, highly synchronized conference program facilitates 
participation in an exciting event, offering attendees the possibility to meet many 
researchers working in different directions in the field, and to easily attend talks of 
different conferences. Before and after the main conference, numerous satellite work- 
shops take place and attract many researchers from all over the globe. 

ETAPS 2018 received 479 submissions in total, 144 of which were accepted, 
yielding an overall acceptance rate of 30%. I thank all the authors for their interest in 
ETAPS, all the reviewers for their peer reviewing efforts, the PC members for their 
contributions, and in particular the PC (co-)chairs for their hard work in running this 
entire intensive process. Last but not least, my congratulations to all authors of the 
accepted papers! 

ETAPS 2018 was enriched by the unifying invited speaker Martin Abadi (Google 
Brain, USA) and the conference-specific invited speakers (FASE) Pamela Zave (AT&T
Labs, USA), (POST) Benjamin C. Pierce (University of Pennsylvania, USA), and
(ESOP) Derek Dreyer (Max Planck Institute for Software Systems, Germany). Invited 
tutorials were provided by Armin Biere (Johannes Kepler University, Linz, Austria) on 
modern SAT solving and Fabio Somenzi (University of Colorado, Boulder, USA) on 
hardware verification. My sincere thanks to all these speakers for their inspiring and 
interesting talks! 

ETAPS 2018 took place in Thessaloniki, Greece, and was organised by the 
Department of Informatics of the Aristotle University of Thessaloniki. The university 
was founded in 1925 and currently has around 75,000 students; it is the largest uni- 
versity in Greece. ETAPS 2018 was further supported by the following associations 
and societies: ETAPS e.V., EATCS (European Association for Theoretical Computer 
Science), EAPLS (European Association for Programming Languages and Systems), 
and EASST (European Association of Software Science and Technology). The local 
organization team consisted of Panagiotis Katsaros (general chair), Ioannis Stamelos, 
Lefteris Angelis, George Rahonis, Nick Bassiliades, Alexander Chatzigeorgiou, Ezio
Bartocci, Simon Bliudze, Emmanouela Stachtiari, Kyriakos Georgiadis, and Petros 
Stratis (EasyConferences). 

The overall planning for ETAPS is the main responsibility of the Steering Com- 
mittee, and in particular of its Executive Board. The ETAPS Steering Committee 
consists of an Executive Board and representatives of the individual ETAPS confer- 
ences, as well as representatives of EATCS, EAPLS, and EASST. The Executive 
Board consists of Gilles Barthe (Madrid), Holger Hermanns (Saarbrücken), Joost-Pieter
Katoen (chair, Aachen and Twente), Gerald Lüttgen (Bamberg), Vladimiro Sassone
(Southampton), Tarmo Uustalu (Tallinn), and Lenore Zuck (Chicago). Other members 
of the Steering Committee are: Wil van der Aalst (Aachen), Parosh Abdulla (Uppsala), 
Amal Ahmed (Boston), Christel Baier (Dresden), Lujo Bauer (Pittsburgh), Dirk Beyer 
(Munich), Mikolaj Bojanczyk (Warsaw), Luis Caires (Lisbon), Jurriaan Hage 
(Utrecht), Rainer Hähnle (Darmstadt), Reiko Heckel (Leicester), Marieke Huisman
(Twente), Panagiotis Katsaros (Thessaloniki), Ralf Küsters (Stuttgart), Ugo Dal Lago
(Bologna), Kim G. Larsen (Aalborg), Matteo Maffei (Vienna), Tiziana Margaria 
(Limerick), Flemming Nielson (Copenhagen), Catuscia Palamidessi (Palaiseau), 
Andrew M. Pitts (Cambridge), Alessandra Russo (London), Dave Sands (Göteborg), 
Don Sannella (Edinburgh), Andy Schürr (Darmstadt), Alex Simpson (Ljubljana),
Gabriele Taentzer (Marburg), Peter Thiemann (Freiburg), Jan Vitek (Prague), Tomas 
Vojnar (Brno), and Lijun Zhang (Beijing). 

I would like to take this opportunity to thank all speakers, attendees, organizers 
of the satellite workshops, and Springer for their support. I hope you all enjoy the 
proceedings of ETAPS 2018. Finally, a big thanks to Panagiotis and his local orga- 
nization team for all their enormous efforts that led to a fantastic ETAPS in 
Thessaloniki! 


February 2018 Joost-Pieter Katoen 


Preface 


This volume contains the papers presented at POST 2018, the 7th Conference on 
Principles of Security and Trust, held April 16-17, 2018, in Thessaloniki, Greece, as 
part of ETAPS. Principles of Security and Trust is a broad forum related to all theo- 
retical and foundational aspects of security and trust, and thus welcomes papers of 
many kinds: new theoretical results, practical applications of existing foundational 
ideas, and innovative approaches stimulated by pressing practical problems; as well as 
systemization-of-knowledge papers, papers describing tools, and position papers. 
POST was created in 2012 to combine and replace a number of successful and 
long-standing workshops in this area: Automated Reasoning and Security Protocol 
Analysis (ARSPA), Formal Aspects of Security and Trust (FAST), Security in Con- 
currency (SecCo), and the Workshop on Issues in the Theory of Security (WITS). 
A subset of these events met jointly as an event affiliated with ETAPS 2011 under the 
name “Theory of Security and Applications” (TOSCA). 

There were 45 submissions to POST 2018. Each submission was reviewed by at 
least three Program Committee members, who in some cases solicited the help of 
outside experts to review the papers. We employed a double-blind reviewing process 
with a rebuttal phase. Electronic discussion was used to decide which papers to select 
for the program. The committee decided to accept 14 papers, including one SoK paper 
and one tool demonstration paper. 

We would like to thank the members of the Program Committee, the additional 
reviewers, the POST Steering Committee, the ETAPS Steering Committee, and the 
local Organizing Committee, who all contributed to the success of POST 2018. We 
also thank all authors of submitted papers for their interest in POST and congratulate 
the authors of accepted papers. 


March 2018 Lujo Bauer 
The Science of Deep Specification
(Abstract of Invited Talk) 


Benjamin C. Pierce


University of Pennsylvania 


Formal specifications significantly improve the security and robustness of critical, 
low-level software and hardware, especially when deeply integrated into the processes 
of system engineering and design [4]. Such “deep specifications” can also be chal- 
lenging to work with, since they must be simultaneously rich (describing complex 
component behaviors in detail), two-sided (connected to both implementations and 
clients), and live (connected directly to the source code of implementations via 
machine-checkable proofs and/or automated testing). 

The DeepSpec project [1] is a multi-institution effort to develop experience with 
building and using serious specifications at many architectural levels—hardware 
instruction-set architectures (MIT), hypervisor kernels (Yale), C semantics (Princeton, 
Yale), compilers for both C (Penn, Princeton, Yale) and functional languages (Penn, 
Princeton), cryptographic operations (Princeton, MIT), and web infrastructure (Penn)— 
and to create new tools for machine-assisted formal verification [2, 3, 5] and 
specification-based testing [6], all within the Coq ecosystem. 

To exercise several of these specifications together, we are building a formally 
specified, tested, and verified web server. Our goal is a “single Q.E.D.” spanning all 
levels of the system—from an executable specification of correct server behavior in 
terms of valid sequences of HTTP requests and responses, all the way down to an RTL 
description of a RISC-V chip and the binary code for a hypervisor running on that chip. 
What’s the Over/Under? Probabilistic
Bounds on Information Leakage 


Ian Sweet, José Manuel Calderón Trilla, Chad Scherrer, Michael Hicks,
and Stephen Magill


1 University of Maryland, College Park, USA 
Abstract. Quantitative information flow (QIF) is concerned with measuring
how much of a secret is leaked to an adversary who observes the
result of a computation that uses it. Prior work has shown that QIF 
techniques based on abstract interpretation with probabilistic polyhedra 
can be used to analyze the worst-case leakage of a query, on-line, to 
determine whether that query can be safely answered. While this app- 
roach can provide precise estimates, it does not scale well. This paper 
shows how to solve the scalability problem by augmenting the baseline 
technique with sampling and symbolic execution. We prove that our app- 
roach never underestimates a query’s leakage (it is sound), and detailed 
experimental results show that we can match the precision of the baseline 
technique but with orders of magnitude better performance. 


1 Introduction 


As more sensitive data is created, collected, and analyzed, we face the problem 
of how to productively use this data while preserving privacy. One approach to 
this problem is to analyze a query f in order to quantify how much information 
about secret input s is leaked by the output f(s). More precisely, we can consider 
a querier to have some prior belief of the secret’s possible values. The belief can 
be modeled as a probability distribution [10], i.e., a function $\delta$ from each possible
value of s to its probability. When a querier observes output o = f(s), he revises
his belief, using Bayesian inference, to produce a posterior distribution $\delta'$. If
the posterior could reveal too much about the secret, then the query should be 
rejected. One common definition of “too much” is Bayes Vulnerability, which is 
the probability of the adversary guessing the secret in one try [41]. Formally, 


$$V(\delta) = \max_i \delta(i)$$


Various works [6,19,24,25] propose rejecting f if there exists an output that 
makes the vulnerability of the posterior exceed a fixed threshold K. In particular, 
for all possible values i of s (i.e., $\delta(i) > 0$), if the output o = f(i) could induce
a posterior $\delta'$ with $V(\delta') > K$, then the query is rejected.
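The rejection rule above can be illustrated with a small brute-force sketch. It assumes a finite secret space and a deterministic query, and it enumerates every secret, which is exactly what does not scale in practice. The function names, the example prior, and the thresholds are illustrative assumptions, not part of the paper's implementation.

```python
from fractions import Fraction


def bayes_vulnerability(dist):
    """V(delta) = max_i delta(i): chance of guessing the secret in one try."""
    return max(dist.values())


def posterior(prior, query, output):
    """Bayesian revision of `prior` after observing query(secret) == output."""
    consistent = {s: p for s, p in prior.items() if query(s) == output}
    mass = sum(consistent.values())
    return {s: p / mass for s, p in consistent.items()}


def is_safe(prior, query, threshold):
    """Accept only if no output reachable from a possible secret exceeds the threshold."""
    outputs = {query(s) for s, p in prior.items() if p > 0}
    return all(
        bayes_vulnerability(posterior(prior, query, o)) <= threshold
        for o in outputs
    )


# Hypothetical example: a secret in 0..9 under a uniform prior.
prior = {s: Fraction(1, 10) for s in range(10)}


def at_least_8(s):
    return s >= 8  # true for only 2 of the 10 secrets


def at_least_5(s):
    return s >= 5  # splits the secrets evenly


print(bayes_vulnerability(prior))                   # 1/10
print(is_safe(prior, at_least_8, Fraction(1, 4)))   # False: a "true" answer gives V = 1/2
print(is_safe(prior, at_least_5, Fraction(1, 4)))   # True: either answer gives V = 1/5
```

Note that the decision depends on all possible outputs, not on the output for the actual secret, so rejecting a query does not itself reveal anything about the secret.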


© The Author(s) 2018
L. Bauer and R. Küsters (Eds.): POST 2018, LNCS 10804, pp. 3-27, 2018.
https://doi.org/10.1007/978-3-319-89722-6_1
One way to implement this approach is to estimate $f(\delta)$, the distribution
of f’s outputs when the inputs are distributed according to $\delta$, by viewing f as
a program in a probabilistic programming language (PPL) [18]. Unfortunately,
as discussed in Sect. 9, most PPLs are approximate in a manner that could
easily result in underestimating the vulnerability, leading to an unsafe security 
decision. Techniques designed specifically to quantify information leakage often 
assume only uniform priors, cannot compute vulnerability (favoring, for example, 
Shannon entropy), and/or cannot maintain assumed knowledge between queries. 

Mardziel et al. [25] propose a sound analysis technique based on abstract 
interpretation [12]. In particular, they estimate a program’s probability distri- 
bution using an abstract domain called a probabilistic polyhedron (PP), which 
pairs a standard numeric abstract domain, such as convex polyhedra [13], with 
some additional ornaments, which include lower and upper bounds on the size of 
the support of the distribution, and bounds on the probability of each possible 
secret value. Using PP can yield a precise, yet safe, estimate of the vulner- 
ability, and allows the posterior PP (which is not necessarily uniform) to be 
used as a prior for the next query. Unfortunately, PPs can be very inefficient. 
Defining intervals [11] as the PP’s numeric domain can dramatically improve 
performance, but only with an unacceptable loss of precision. 

In this paper we present a new approach that ensures a better balance of both 
precision and performance in vulnerability computation, augmenting PP with 
two new techniques. In both cases we begin by analyzing a query using the fast 
interval-based analysis. Our first technique is then to use sampling to augment 
the result. In particular, we execute the query using possible secret values i 
sampled from the posterior $\delta'$ derived from a particular output $o_i$. If the analysis
were perfectly accurate, executing f(i) would produce $o_i$. But since intervals are
overapproximate, sometimes it will not. With many sampled outcomes, we can 
construct a Beta distribution to estimate the size of the support of the posterior, 
up to some level of confidence. We can use this estimate to boost the lower bound 
of the abstraction, and thus improve the precision of the estimated vulnerability. 

Our second technique is of a similar flavor, but uses symbolic reasoning to 
magnify the impact of a successful sample. In particular, we execute a query 
result-consistent sample concolically [39], thus maintaining a symbolic formula 
(called the path condition) that characterizes the set of variable valuations that 
would cause execution to follow the observed path. We then count the number 
of possible solutions and use the count to boost the lower bound of the support 
(with 100% confidence). 

Sampling and concolic execution can be combined for even greater precision. 

We have formalized and proved our techniques are sound (Sects. 3-6) and 
implemented and evaluated them (Sects. 7 and 8). Using a privacy-sensitive ship 
planning scenario (Sect. 2) we find that our techniques provide similar precision 
to convex polyhedra while providing orders-of-magnitude better performance. 
More experiments are needed to see if the approach provides such benefits more 
generally. Our implementation is freely available at
https://github.com/GaloisInc/TAMBA.
Field      Type     Range                    Private?
ShipID     Integer  1 to 10                  No
NationID   Integer  1 to 20                  No
Capacity   Integer  0 to 1000                Yes
Latitude   Integer  -900,000 to 900,000      Yes
Longitude  Integer  -1,800,000 to 1,800,000  Yes


Fig. 1. The data model used in the evacuation scenario. 


2 Overview 


To provide an overview of our approach, we will describe the application of our 
techniques to a scenario that involves a coalition of ships from various nations 
operating in a shared region. Suppose a natural disaster has impacted some 
islands in the region. Some number of individuals need to be evacuated from 
the islands, and it falls to a regional disaster response coordinator to determine 
how to accomplish this. While the coalition wants to collaborate to achieve 
these humanitarian aims, we assume that each nation also wants to protect 
their sensitive data—namely ship locations and capacity. 

More formally, we assume the use of the data model shown in Fig. 1, which 
considers a set of ships, their coalition affiliation, the evacuation capacity of the 
ship, and its position, given in terms of latitude and longitude.¹ We sometimes
refer to the latter two as a location L, with L.x as the longitude and L.y as the 
latitude. We will often index properties by ship ID, writing Capacity(z) for the 
capacity associated with ship ID z, or Location(z) for the location. 

The evacuation problem is defined as follows:


Given a target location L and number of people to evacuate N, compute
a set of nearby ships S such that $\sum_{z \in S} \mathit{Capacity}(z) \geq N$.


Our goal is to solve this problem in a way that minimizes the vulnerability to 
the coordinator of private information, i.e., the ship locations and their exact 
capacity. We assume that this coordinator initially has no knowledge of the 
positions or capabilities of the ships other than that they fall within certain 
expected ranges. 

If all members of the coalition share all of their data with the coordinator, 
then a solution is easy to compute, but it affords no privacy. Figure 2 gives
an algorithm the response coordinator can follow that does not require each 
member to share all of their data. Instead, it iteratively performs queries AtLeast 
and Nearby. These queries do not reveal precise values about ship locations 
or capacity, but rather admit ranges of possibilities. The algorithm works by 
maintaining upper and lower bounds on the capacity of each ship i in the array 
berths. Each ship’s bounds are updated based on the results of queries about its 


1 We give latitude and longitude values as integer representations of decimal degrees 
fixed to four decimal places; e.g., 14.3579 decimal degrees is encoded as 143579. 
capacity and location. These queries aim to be privacy preserving, doing a sort of
binary search to narrow in on the capacity of each ship in the operating area. The 
procedure completes once is_solution determines the minimum required capacity 
is reached. 


(* S = #ships; N = #evacuees; L = island loc.; D = min. proximity to L *)
let berths = Array.make S (0,1000)
let is_solution () = sum (Array.map fst berths) >= N
let mid (x,y) = (x + y) / 2
let AtLeast(z,b) = Capacity(z) >= b
let Nearby(z,l,d) = |Loc(z).x - l.x| + |Loc(z).y - l.y| <= d
while true do
  for i = 0 to S - 1 do  (* array indices run from 0 to S - 1 *)
    let ask = mid berths[i]
    let ok = AtLeast(i,ask) && Nearby(i,L,D)
    if ok then berths[i] <- (ask, snd berths[i])
    else berths[i] <- (fst berths[i], ask)
    if is_solution () then return berths
  done
done


Fig. 2. Algorithm to solve the evacuation problem for a single island. 
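For readers who want to run the procedure of Fig. 2, the following Python sketch mirrors it under some assumptions. The `Ship` record, the example fleet, and the `max_rounds` guard are hypothetical additions, since the figure loops forever when the nearby ships cannot cover N. Queries are answered directly here, whereas in the paper's setting each query would first be vetted for leakage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ship:
    capacity: int
    x: int  # longitude (hypothetical small units for the example)
    y: int  # latitude


def at_least(ship, b):
    return ship.capacity >= b


def nearby(ship, loc, d):
    # Manhattan distance to the target location
    return abs(ship.x - loc[0]) + abs(ship.y - loc[1]) <= d


def evacuate(ships, n, loc, d, max_rounds=32):
    """Narrow per-ship capacity bounds until the lower bounds sum to at least n."""
    berths = [(0, 1000) for _ in ships]
    for _ in range(max_rounds):  # guard added for this sketch only
        for i, ship in enumerate(ships):
            lo, hi = berths[i]
            ask = (lo + hi) // 2
            if at_least(ship, ask) and nearby(ship, loc, d):
                berths[i] = (ask, hi)   # raise the lower bound
            else:
                berths[i] = (lo, ask)   # lower the upper bound
            if sum(low for low, _ in berths) >= n:
                return berths
    return None  # no solution found within max_rounds


fleet = [Ship(600, 0, 0), Ship(300, 2, 1), Ship(900, 50, 50)]
print(evacuate(fleet, n=700, loc=(0, 0), d=5))
# [(500, 750), (250, 500), (0, 500)]
```

The third ship is far from the island, so its lower bound never rises even though its capacity is large; the coordinator learns only ranges, never exact capacities or positions.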


2.1 Computing Vulnerability with Abstract Interpretation 


Using this procedure, what is revealed about the private variables (location and 
capacity)? Consider a single Nearby(z,l,d) query. At the start, the coordinator 
is assumed to know only that z is somewhere within the operating region. If 
the query returns true, the coordinator now knows that ship z is within d units of
l (using Manhattan distance). This makes Location(z) more vulnerable because 
the adversary has less uncertainty about it. 

Mardziel et al. [25] proposed a static analysis for analyzing queries such as 
Nearby(z,l,d) to estimate the worst-case vulnerability of private data. If the 
worst-case vulnerability is too great, the query can be rejected. A key element 
of their approach is to perform abstract interpretation over the query using an 
abstract domain called a probabilistic polyhedron. An element P of this domain 
represents the set of possible distributions over the query’s state. This state 
includes both the hidden secrets and the visible query results. The abstract 
interpretation is sound in the sense that the true distribution $\delta$ is contained in
the set of distributions represented by the computed probabilistic polyhedron P. 

A probabilistic polyhedron P is a tuple comprising a shape and three orna- 
ments. The shape C is an element of a standard numeric domain—e.g., inter- 
vals [11], octagons [29], or convex polyhedra [13]—which overapproximates the 
set of possible values in the support of the distribution. The ornaments $p \in [0, 1]$,
$m \in \mathbb{R}$, and $s \in \mathbb{Z}$ are pairs which store upper and lower bounds on the
probability per point, the total mass, and number of support points in the distribution,
respectively. (Distributions represented by P are not necessarily normalized, so 
the mass m is not always 1.) 
Figure 3(a) gives an example probabilistic polyhedron that represents the
posterior of a Nearby query that returns true. In particular, if Nearby(z, L1, D)
is true then Location(z) is somewhere within the depicted diamond around L1.
Using convex polyhedra or octagons for the shape domain would permit representing
this diamond exactly; using intervals would overapproximate it as the
depicted 9 × 9 bounding box. The ornaments would be the same in any case: the
size s of the support is 41 possible (x, y) points, the probability p per point is
0.01, and the total mass is 0.41, i.e., $p \cdot s$. In general, each ornament is a pair of
a lower and upper bound (e.g., $s_{\min}$ and $s_{\max}$), and m might be a more accurate
estimate than $p \cdot s$. In the case shown in the figure, the bounds are tight.

Mardziel et al.'s procedure works by computing the posterior P for each
possible query output o, and from that posterior determining the vulnerability.
This is easy to do. The upper bound $p_{\max}$ of p maximizes the probability of
any given point. Dividing this by the lower bound $m_{\min}$ of the probability mass
m normalizes this probability for the worst case. For P shown in Fig. 3(a), the
bounds of p and m are tight, so the vulnerability is simply 0.01/0.41 = 0.024.
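The worst-case normalization just described is a single division. The sketch below only restates it in Python; the function name is an illustrative assumption.

```python
def vulnerability_upper_bound(p_max, m_min):
    """Worst-case normalized probability of any single point: p_max / m_min."""
    if m_min <= 0:
        raise ValueError("m_min must be positive to normalize")
    return p_max / m_min


# Values for the probabilistic polyhedron of Fig. 3(a), where the bounds are tight.
v = vulnerability_upper_bound(p_max=0.01, m_min=0.41)
print(round(v, 3))  # 0.024
```

Using the largest per-point probability and the smallest total mass makes the quotient an upper bound on the vulnerability of every distribution the polyhedron represents.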


2.2 Improving Precision with Sampling and Concolic Execution 


In Fig. 3(a), the parameters s, p, and m are precise. However, as additional oper- 
ations are performed, these quantities can accumulate imprecision. For example, 
suppose we are using intervals for the shape domain, and we wish to analyze the 
query Nearby(z, L1, 4) $\lor$ Nearby(z, L2, 4) (for some nearby point L2). The result
is produced by analyzing the two queries separately and then combining them 
with an abstract join; this is shown in the top row of Fig. 3(b). Unfortunately, 
the result is very imprecise. The bottom row of Fig. 3(b) illustrates the result we 
would get by using convex polyhedra as our shape domain. When using intervals 
(top row), the vulnerability is estimated as 0.036, whereas the precise answer 
(bottom row) is actually 0.026. Unfortunately, obtaining this precise answer is 
far more expensive than obtaining the imprecise one. 

This paper presents two techniques that can allow us to use the less pre- 
cise interval domain but then recover lost precision in a relatively cheap post- 
processing step. The effect of our techniques is shown in the middle-right of 
Fig. 3(b). Both techniques aim to obtain better lower bounds for s. This allows 
us to update lower bounds on the probability mass m since $m_{\min}$ is at least
$s_{\min} \cdot p_{\min}$ (each point has at least probability $p_{\min}$ and there are at least
$s_{\min}$ of them). A larger m means a smaller vulnerability.

The first technique we explore is sampling, depicted to the right of the arrow 
in Fig. 3(b). Sampling chooses random points and evaluates the query on them 
to determine whether they are in the support of the posterior distribution for a 
particular query result. By tracking the ratio of points that produce the expected 
output, we can produce an estimate of s, whose confidence increases as we include 
more samples. This approach is depicted in the figure, where we conclude that 
$s \in [72, 81]$ and $m \in [0.72, 1.62]$ with 90% confidence after taking 1000 samples,
improving our vulnerability estimate to $V \leq 0.02/0.72 = 0.028$.
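A minimal sketch of the sampling idea follows, under several assumptions that are not stated in this section: points are sampled uniformly from the interval bounding box, the hit rate is modeled as a Beta distribution with a uniform prior, and the box holds 135 points (a hypothetical size chosen for illustration). Quantiles are approximated by Monte Carlo with the standard library rather than by an exact inverse CDF, so the endpoints may differ slightly from the figure's.

```python
import math
import random


def support_interval(n_in, n_out, box_size, credibility=0.90, draws=20000, seed=0):
    """Credible interval for the support size from hit/miss sample counts.

    The hit rate is modeled as Beta(n_in + 1, n_out + 1); its quantiles are
    approximated from `draws` random variates.
    """
    rng = random.Random(seed)
    rates = sorted(rng.betavariate(n_in + 1, n_out + 1) for _ in range(draws))
    tail = (1.0 - credibility) / 2.0
    lo_rate = rates[int(tail * draws)]
    hi_rate = rates[int((1.0 - tail) * draws) - 1]
    # Round outward so the interval is not narrowed by rounding.
    return math.floor(lo_rate * box_size), math.ceil(hi_rate * box_size)


BOX_SIZE = 135  # hypothetical number of points in the bounding box
s_lo, s_hi = support_interval(n_in=570, n_out=430, box_size=BOX_SIZE)

p_min, p_max = 0.01, 0.02
m_min = s_lo * p_min  # m_min >= s_min * p_min
print(s_lo, s_hi, round(p_max / m_min, 3))
```

Unlike the bounds produced by abstract interpretation, an interval obtained this way holds only with the stated credibility, which is why the text qualifies the improved estimate with a confidence level.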
[Figure 3. Panel (a), probabilistic polyhedra: a diamond-shaped posterior annotated with
number of points $s \in [41, 41]$, probability per point $p \in [0.01, 0.01]$, total
probability mass $m \in [0.41, 0.41]$, and upper bound on max probability
$p_{\max}/m_{\min} = 0.01/0.41 = 0.024$.
Panel (b), improving precision with sampling and underapproximation (concolic execution):
the interval abstraction of the join has $s \in [55, 82]$, $p \in [0.01, 0.02]$,
$m \in [0.55, 1.64]$, giving the sound result max probability $\leq 0.02/0.55 = 0.036$;
27 points lie in the overlap; underapproximation yields $s \geq 63$ and max probability
$\leq 0.032$; sampling (in = 570, out = 430) yields $s \in [72, 81]$ (90% cred.) and max
probability $\leq 0.028$; the precise representation yields max probability
$\leq 0.02/0.77 = 0.026$.]


Fig. 3. Computing vulnerability (max probability) using abstract interpretation
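Variables      $x \in \mathit{Var}$
Integers       $n \in \mathbb{Z}$
Rationals      $q \in \mathbb{Q}$
States         $\sigma \in \mathit{State} \stackrel{\text{def}}{=} \mathit{Var} \rightharpoonup \mathbb{Z}$
Distributions  $\delta \in \mathit{Dist} \stackrel{\text{def}}{=} \mathit{State} \to \mathbb{R}^{+}$
Arith. ops     $\mathit{aop} ::= + \mid \times \mid -$
Rel. ops       $\mathit{relop} ::= \leq \mid < \mid = \mid \neq \mid \cdots$
Arith. exps    $E ::= x \mid n \mid E_1\ \mathit{aop}\ E_2$
Bool. exps     $B ::= E_1\ \mathit{relop}\ E_2 \mid B_1 \land B_2 \mid B_1 \lor B_2 \mid \lnot B$
Statements     $S ::= \mathsf{skip} \mid x := E \mid S_1 ; S_2 \mid \mathsf{while}\ B\ \mathsf{do}\ S \mid$
               $\mathsf{if}\ B\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2 \mid \mathsf{pif}\ q\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2$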


Fig. 4. Core language syntax 


The second technique we explore is the use of concolic execution to derive 
a path condition, which is a formula over secret values that is consistent with a 
query result. By performing model counting to estimate the number of solutions
to this formula, which underapproximates the true size of the distribution's
support, we can safely boost the lower bound of s. This approach is depicted to
the left of the arrow in Fig. 3(b). The depicted shapes represent the disjuncts of
the discovered path condition, whose sizes sum to 63. This is a better lower bound
on s and improves the vulnerability estimate to 0.032.
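The three vulnerability estimates quoted in this section (0.036, 0.032, and 0.028) all follow from the same arithmetic: raise the lower bound on s, propagate it to the mass, and divide. The sketch below reproduces them from the interval-based ornaments shown in Fig. 3(b); the function name is an illustrative assumption.

```python
def boosted_vulnerability(p_min, p_max, m_min, s_min):
    """Tighten m_min using m_min >= s_min * p_min, then bound V by p_max / m_min."""
    m_min = max(m_min, s_min * p_min)
    return p_max / m_min


# Interval-based ornaments from Fig. 3(b): p in [0.01, 0.02], m_min = 0.55.
P_MIN, P_MAX, M_MIN = 0.01, 0.02, 0.55

for label, s_min in [("intervals only", 55), ("concolic", 63), ("sampling", 72)]:
    print(label, round(boosted_vulnerability(P_MIN, P_MAX, M_MIN, s_min), 3))
# intervals only 0.036
# concolic 0.032
# sampling 0.028
```

Taking the maximum with the existing m_min means a weak lower bound on s can never make the estimate worse than the interval-based result.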

These techniques can be used together to further increase precision. In partic- 
ular, we can first perform concolic execution, and then sample from the area not 
covered by this underapproximation. Importantly, Sect. 8 shows that using our 
techniques with the interval-based analysis yields an orders-of-magnitude performance
improvement over using polyhedra-based analysis alone, while achieving
similar levels of precision, with high confidence. 


3 Preliminaries: Syntax and Semantics 


This section presents the core language—syntax and semantics—in which we 
formalize our approach to computing vulnerability. We also review probabilistic 
polyhedra [25], which is the baseline analysis technique that we augment. 


3.1 Core Language and Semantics 


The programming language we use for queries is given in Fig. 4. The language 
is essentially standard, apart from pif q then $S_1$ else $S_2$, which implements
probabilistic choice: $S_1$ is executed with probability q, and $S_2$ with probability 1 − q.
We limit the form of expressions E so that they can be approximated by standard
numeric abstract domains such as convex polyhedra [13]. Such domains
require linear forms; e.g., there is no division operator and multiplication of two
variables is disallowed.²


² Relaxing such limitations is possible (e.g., polynomial inequalities can be approximated
using convex polyhedra [5]), but doing so precisely and scalably is a challenge.
We define the semantics of a program in terms of its effect on (discrete)
distributions of states. States $\sigma$ are partial maps from variables to integers; we
write $\mathit{domain}(\sigma)$ for the set of variables over which $\sigma$ is defined. Distributions
$\delta$ are maps from states to nonnegative real numbers, interpreted as probabilities
(in range [0,1]). The denotational semantics considers a program as a relation
between distributions. In particular, the semantics of statement S, written $[\![S]\!]$,
is a function of the form Dist → Dist; we write $[\![S]\!]\delta = \delta'$ to say that the
semantics of S maps input distribution $\delta$ to output distribution $\delta'$. Distributions
are not necessarily normalized; we write $\|\delta\|$ for the probability mass of $\delta$ (which
is between 0 and 1). We write $\dot{\sigma}$ to denote the point distribution that gives $\sigma$
probability 1, and all other states 0.
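To make the notions of distribution, mass, and point distribution concrete, here is a small sketch that represents a distribution as a dictionary from states to probabilities. The treatment of pif (scale the input by q and 1 − q, run each branch, and add the results) is an assumption consistent with the informal description of probabilistic choice, not a transcription of the full semantics in Appendix B. All helper names are illustrative.

```python
from fractions import Fraction


def state(**bindings):
    """A hashable program state: a finite map from variable names to integers."""
    return frozenset(bindings.items())


def point(st):
    """Point distribution: probability 1 for st, 0 for every other state."""
    return {st: Fraction(1)}


def mass(dist):
    """Total probability mass ||delta||."""
    return sum(dist.values(), Fraction(0))


def scale(q, dist):
    return {st: q * p for st, p in dist.items()}


def add(d1, d2):
    out = dict(d1)
    for st, p in d2.items():
        out[st] = out.get(st, Fraction(0)) + p
    return out


def assign(var, expr):
    """Transformer for `var := expr`: move each state's probability to its update."""
    def run(dist):
        out = {}
        for st, p in dist.items():
            env = dict(st)
            env[var] = expr(env)
            new = frozenset(env.items())
            out[new] = out.get(new, Fraction(0)) + p
        return out
    return run


def pif(q, s1, s2):
    """Transformer for `pif q then s1 else s2` (assumed scale-and-add semantics)."""
    def run(dist):
        return add(s1(scale(q, dist)), s2(scale(1 - q, dist)))
    return run


flip = pif(Fraction(1, 4), assign("x", lambda env: 1), assign("x", lambda env: 0))
out = flip(point(state(x=0)))
print(out[state(x=1)], out[state(x=0)], mass(out))  # 1/4 3/4 1
```

Because the transformers operate on unnormalized dictionaries, intermediate results such as `scale(q, dist)` naturally have mass below 1, which matches the remark that distributions need not be normalized.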

The semantics is standard and not crucial in order to understand our tech- 
niques. In Appendix B we provide the semantics in full. See Clarkson et al. [10] 
or Mardziel et al. [25] for detailed explanations. 


3.2 Probabilistic Polyhedra 


To compute vulnerability for a program S we must compute (an approximation 
of) its output distribution. One way to do that would be to use sampling: Choose 
states $\sigma$ at random from the input distribution $\delta$, “run” the program using that
input state, and collect the frequencies of output states $\sigma'$ into a distribution $\delta'$.
While using sampling in this manner is simple and appealing, it could be both 
expensive and imprecise. In particular, depending on the size of the input and 
output space, it may take many samples to arrive at a proper approximation of 
the output distribution. 

Probabilistic polyhedra [25] can address both problems. This abstract domain 
combines a standard domain C for representing numeric program states with 
additional ornaments that all together can safely represent S’s output distribu- 
tion. 

Probabilistic polyhedra work for any numeric domain; in this paper we use 
both convex polyhedra [13] and intervals [11]. For concreteness, we present the 
definition using convex polyhedra. We use the meta-variables $\beta$, $\beta_1$, $\beta_2$, etc. to
denote linear inequalities. 


Definition 1. A convex polyhedron $C = (B, V)$ is a set of linear inequalities
$B = \{\beta_1, \ldots, \beta_m\}$, interpreted conjunctively, over variables V. We write
$\mathbb{C}$ for the set of all convex polyhedra. A polyhedron C represents a set of states,
denoted $\gamma_{\mathbb{C}}(C)$, as follows, where $\sigma \models \beta$ indicates that the state
$\sigma$ satisfies the inequality $\beta$.


$$\gamma_{\mathbb{C}}((B, V)) = \{\sigma : \mathit{domain}(\sigma) = V,\ \forall \beta \in B.\ \sigma \models \beta\}$$


Naturally we require that $\mathit{domain}(\{\beta_1, \ldots, \beta_n\}) \subseteq V$; i.e., V mentions all
variables in the inequalities. Let $\mathit{domain}((B, V)) = V$.
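As a concrete reading of Definition 1, the sketch below encodes each inequality as a coefficient map with an upper bound and enumerates the integer states that satisfy all of them. The finite search box is an assumption needed only for brute-force enumeration; practical analyses rely on a polyhedra library instead. The example is the diamond of radius 4 from Sect. 2.1, centered at the origin for simplicity.

```python
from itertools import product


def satisfies(st, inequality):
    """Check sigma |= beta for beta of the form sum(coeff * var) <= bound."""
    coeffs, bound = inequality
    return sum(c * st[v] for v, c in coeffs.items()) <= bound


def concretize(inequalities, box):
    """Enumerate the states of the polyhedron inside a finite box {var: (lo, hi)}."""
    names = sorted(box)
    ranges = [range(box[v][0], box[v][1] + 1) for v in names]
    for values in product(*ranges):
        st = dict(zip(names, values))
        if all(satisfies(st, b) for b in inequalities):
            yield st


# |x| + |y| <= 4 written as four linear inequalities, interpreted conjunctively.
diamond = [
    ({"x": 1, "y": 1}, 4),
    ({"x": 1, "y": -1}, 4),
    ({"x": -1, "y": 1}, 4),
    ({"x": -1, "y": -1}, 4),
]
box = {"x": (-4, 4), "y": (-4, 4)}

print(sum(1 for _ in concretize(diamond, box)))  # 41
```

The count of 41 agrees with the support size given for the Nearby posterior, while the enclosing 9 × 9 box used for the search contains 81 points.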


Probabilistic polyhedra extend this standard representation of sets of pro- 
gram states to sets of distributions over program states. 
Definition 2. A probabilistic polyhedron P is a tuple
$(C, s^{\min}, s^{\max}, p^{\min}, p^{\max}, m^{\min}, m^{\max})$. We write $\mathbb{P}$ for the set of
probabilistic polyhedra. The quantities $s^{\min}$ and $s^{\max}$ are lower and upper bounds
on the number of support points in the concrete distribution(s) P represents. A support
point of a distribution is one which has non-zero probability. The quantities $p^{\min}$
and $p^{\max}$ are lower and upper bounds on the probability mass per support point. The
$m^{\min}$ and $m^{\max}$ components give bounds on the total probability mass (i.e., the sum
of the probabilities of all support points). Thus P represents the set of distributions
$\gamma_{\mathbb{P}}(P)$ defined below.


$$\begin{aligned}
\gamma_{\mathbb{P}}(P) = \{\delta :\ & \mathit{support}(\delta) \subseteq \gamma_{\mathbb{C}}(C)\ \land \\
& s^{\min} \leq |\mathit{support}(\delta)| \leq s^{\max}\ \land \\
& m^{\min} \leq \|\delta\| \leq m^{\max}\ \land \\
& \forall \sigma \in \mathit{support}(\delta).\ p^{\min} \leq \delta(\sigma) \leq p^{\max}\}
\end{aligned}$$


We will write $\mathit{domain}(P) = \mathit{domain}(C)$ to denote the set of variables used
in the probabilistic polyhedron.


Note the set $\gamma_{\mathbb{P}}(P)$ is a singleton exactly when $s^{\min} = s^{\max} = \#(C)$ and
$p^{\min} = p^{\max}$ and $m^{\min} = m^{\max}$, where $\#(C)$ denotes the number of discrete
points in convex polyhedron C. In such a case $\gamma_{\mathbb{P}}(P)$ contains only the uniform
distribution where each state in $\gamma_{\mathbb{C}}(C)$ has probability $p^{\min}$. In general, however,
the concretization of a probabilistic polyhedron will have an infinite number of
distributions, with per-point probabilities varied somewhere in the range $p^{\min}$
and $p^{\max}$. Distributions represented by a probabilistic polyhedron are not necessarily
normalized. In general, there is a relationship between $p^{\min}$, $s^{\min}$, and
$m^{\min}$ in that $m^{\min} \geq p^{\min} \cdot s^{\min}$ (and $m^{\max} \leq p^{\max} \cdot s^{\max}$) and the
combination of the three can yield more information than any two in isolation.
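The four conditions of Definition 2 translate directly into a membership test. In this sketch the shape is stood in for by an explicit finite set of states, which is an assumption made for illustration; the class and field names are likewise hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ProbPoly:
    region: frozenset  # explicit stand-in for the concretization of the shape C
    s_min: int
    s_max: int
    p_min: Fraction
    p_max: Fraction
    m_min: Fraction
    m_max: Fraction

    def contains(self, dist):
        """Check whether dist lies in the concretization of this polyhedron."""
        support = {st for st, p in dist.items() if p != 0}
        total = sum(dist.values())
        return (
            support <= self.region
            and self.s_min <= len(support) <= self.s_max
            and self.m_min <= total <= self.m_max
            and all(self.p_min <= dist[st] <= self.p_max for st in support)
        )


# The diamond of Fig. 3(a), centered at the origin, with tight bounds.
diamond = frozenset(
    (x, y) for x in range(-4, 5) for y in range(-4, 5) if abs(x) + abs(y) <= 4
)
p = Fraction(1, 100)
tight = ProbPoly(diamond, 41, 41, p, p, 41 * p, 41 * p)

uniform = {pt: p for pt in diamond}
print(tight.contains(uniform))  # True

smaller = dict(list(uniform.items())[1:])  # drop one support point
print(tight.contains(smaller))  # False
```

With tight bounds the polyhedron admits exactly one distribution, the uniform one, which is the singleton case discussed above.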

The abstract semantics of S is written $\langle\!\langle S \rangle\!\rangle P = P'$, and indicates that
abstractly interpreting S where the distribution of input states is approximated
by $P$ will produce $P'$, which approximates the distribution of output states.
Following standard abstract interpretation terminology, $\wp(\mathit{Dist})$ (sets of
distributions) is the concrete domain, $\mathbb{P}$ is the abstract domain, and
$\gamma_{\mathbb{P}} : \mathbb{P} \to \wp(\mathit{Dist})$ is
the concretization function for $\mathbb{P}$. We do not present the abstract semantics here;
details can be found in Mardziel et al. [25]. Importantly, this abstract semantics
is sound:


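Theorem 1 (Soundness). For all $S, P_1, P_2, \delta_1, \delta_2$, if $\delta_1
\in \gamma_{\mathbb{P}}(P_1)$ and $\langle\langle S \rangle\rangle P_1 = P_2$,
then $[\![S]\!]\delta_1 = \delta_2$ with $\delta_2 \in \gamma_{\mathbb{P}}(P_2)$. 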


Proof. See Theorem 6 in Mardziel et al. [25]. 


Consider the example from Sect. 2.2. We assume the adversary has no prior 
information about the location of ship s. So, $\delta_1$ above is simply the
uniform distribution over all possible locations. The statement $S$ is the query
issued by the adversary, Nearby(z, $L_1$, 4) $\vee$ Nearby(z, $L_2$, 4).³ If we
assume that the result of the 


3 Appendix A shows the code, which computes Manhattan distance between s and $L_1$ 
and $L_2$ and then sets an output variable if either distance is within four units. 
query is true then the adversary learns that the location of s is within
(Manhattan) distance 4 of $L_1$ or $L_2$. This posterior belief ($\delta_2$) is
represented by the overlapping diamonds on the bottom-right of Fig. 3(b). The
abstract interpretation produces a sound (interval) overapproximation ($P_2$) of
the posterior belief. This is modeled by the rectangle which surrounds the
overlapping diamonds. This rectangle is the “join” of two overlapping boxes,
which each correspond to one of the Nearby calls in the disjuncts of $S$. 


4 Computing Vulnerability: Basic Procedure 


The key goal of this paper is to quantify the risk to secret information of running 
a query over that information. This section explains the basic approach by which 
we can use probabilistic polyhedra to compute vulnerability, i.e., the probability 
of the most probable point of the posterior distribution. Improvements on this 
basic approach are given in the next two sections. 

Our convention will be to use $C_i$, $s_i^{min}$, $s_i^{max}$, etc. for the
components associated with probabilistic polyhedron $P_i$. In the program $S$ of
interest, we assume that secret variables are in the set $T$, so input states
are written $\sigma_T$, and we assume there is a single output variable $r$. We
assume that the adversary’s initial uncertainty about the possible values of the
secrets $T$ is captured by the probabilistic polyhedron $P_0$ (such that
domain($P_0$) $\supseteq T$). 

Computing vulnerability occurs according to the following procedure. 


1. Perform abstract interpretation: $\langle\langle S \rangle\rangle P_0 = P$.
2. Given a concrete output value of interest, o, perform abstract conditioning 
   to define $P_{r=o} \stackrel{\text{def}}{=} (P \wedge r = o)$.⁴


The vulnerability V is the probability of the most likely state(s). When a
probabilistic polyhedron represents one or more true distributions (i.e., the
probabilities all sum to 1), the most probable state’s probability is bounded by
$p^{max}$. However, the abstract semantics does not always normalize the
probabilistic polyhedron as it computes, so we need to scale $p^{max}$ according
to the total probability mass. To ensure that our estimate is on the safe side,
we scale $p^{max}$ using the minimum probability mass: $V = \frac{p^{max}}{m^{min}}$.
In Fig. 3(b), the sound approximation in the top-right has $V \le
\frac{0.02}{0.55} = 0.036$ and the most precise approximation in the
bottom-right has $V \le \frac{0.02}{0.77} = 0.026$. 
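
The scaling step can be checked with a few lines of Python. This is a minimal sketch: the function name `vulnerability_bound` is hypothetical, and the inputs assume a maximum per-point probability of 0.02 with minimum masses of 0.55 (interval approximation) and 0.77 (most precise approximation) for the example of Fig. 3(b). Capping the bound at 1, and returning 1 when no mass is guaranteed, are assumptions of the sketch rather than something the text specifies.

```python
from fractions import Fraction


def vulnerability_bound(p_max, m_min):
    """Upper bound on the probability of the most likely state.

    Divides the largest per-point probability by the smallest total mass,
    which is the conservative way to normalize. The cap at 1 and the
    treatment of m_min <= 0 are assumptions of this sketch.
    """
    if m_min <= 0:
        return Fraction(1)
    return min(Fraction(1), Fraction(p_max) / Fraction(m_min))


# Assumed inputs for the two approximations of the running example.
interval_bound = vulnerability_bound(Fraction("0.02"), Fraction("0.55"))
precise_bound = vulnerability_bound(Fraction("0.02"), Fraction("0.77"))
print(round(float(interval_bound), 3))  # 0.036
print(round(float(precise_bound), 3))   # 0.026
```

Using exact fractions mirrors the exact arithmetic that the implementation relies on, and avoids rounding a bound in the unsafe direction.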


5 Improving Precision with Sampling 


We can improve the precision of the basic procedure using sampling. First we 
introduce some notational convenience: 


$P_T \stackrel{\text{def}}{=} P \wedge (r = o) \upharpoonright T$
$P_{T+} \stackrel{\text{def}}{=} P_T$ revised polyhedron with confidence $\omega$


4 We write $P \wedge B$ and not $P \mid B$ because $P$ need not be normalized. 
$P_T$ is equivalent to step 2, above, but projected onto the set of secret
variables $T$. $P_{T+}$ is the improved (via sampling) polyhedron. 

After computing Pr with the basic procedure from the previous section we 
take the following additional steps: 


1. Set counters $\alpha$ and $\beta$ to zero. 
2. Do the following N times (for some N, see below): 
   (a) Randomly select an input state $\sigma_T \in \gamma_C(C_T)$. 
   (b) “Run” the program by computing $[\![S]\!]\dot{\sigma}_T = \delta$. If
       there exists $\sigma \in \mathrm{support}(\delta)$ with $\sigma(r) = o$
       then increment $\alpha$, else increment $\beta$. 

3. We can interpret $\alpha$ and $\beta$ as the parameters of a Beta
   distribution of the likelihood that an arbitrary state in $\gamma_C(C_T)$ is
   in the support of the true distribution. From these parameters we can compute
   the credible interval $[p_L, p_U]$ within which is contained the true
   likelihood, with confidence $\omega$ (where $0 \le \omega \le 1$). A credible
   interval is essentially a Bayesian analogue of a confidence interval and can
   be computed from the cumulative distribution function (CDF) of the Beta
   distribution (the 99% credible interval is the interval [a, b] such that the
   CDF at a has value 0.005 and the CDF at b has value 0.995). In general,
   obtaining a higher confidence or a narrower interval will require a higher N.
   Let the result $P_{T+}$ equal $P_T$ except that $s_{T+}^{min} = p_L \cdot
   \#(C_T)$ and $s_{T+}^{max} = p_U \cdot \#(C_T)$ (assuming these improve on
   $s_T^{min}$ and $s_T^{max}$). We can then propagate these improvements to
   $m^{min}$ and $m^{max}$ by defining $m_{T+}^{min} = p_T^{min} \cdot
   s_{T+}^{min}$ and $m_{T+}^{max} = p_T^{max} \cdot s_{T+}^{max}$. Note that if
   $m_T^{min} > m_{T+}^{min}$ we leave it unchanged, and do likewise if
   $m_T^{max} < m_{T+}^{max}$. 


At this point we can compute the vulnerability as in the basic procedure, but 
using Pry instead of Pr. 

Consider the example of Sect. 2.2. In Fig. 3(b), we draw samples from the 
rectangle in the top-right. This rectangle overapproximates the set of locations 
where s might be, given that the query returned true. We sample locations 
from this rectangle and run the query on each sample. The green (red) dots 
indicate true (false) results, which are added to $\alpha$ ($\beta$). After
sampling N = 1000 locations, we have $\alpha = 570$ and $\beta = 430$. Choosing
$\omega = 0.9$ (90%), we compute the credible interval [0.53, 0.60]. With
$\#(C_T) = 135$, we compute $[s_{T+}^{min}, s_{T+}^{max}]$ as
$[0.53 \cdot 135, 0.60 \cdot 135] = [72, 81]$. 
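
The sampling steps can be sketched in Python as follows. Several details are assumptions made for illustration: the island coordinates (only their 6-unit horizontal offset is given in Appendix A), the bounding rectangle derived from them, the helper names, the prior lower bound of 55 support points, and the use of Monte Carlo quantiles from the standard library's `random.betavariate` in place of an exact Beta CDF. Rounding the size bounds inward to integers is also an assumption.

```python
import math
import random
from fractions import Fraction

# Hypothetical coordinates: only the 6-unit offset between L1 and L2 is given.
L1, L2, D = (0, 0), (6, 0), 4
BOX = ((-4, 10), (-4, 4))  # bounding rectangle C_T: 15 * 9 = 135 integer points


def query(x, y):
    """Nearby(z, L1, 4) or Nearby(z, L2, 4), using Manhattan distance."""
    return any(abs(x - lx) + abs(y - ly) <= D for lx, ly in (L1, L2))


def sample_counts(n, rng):
    """Step 2: draw n states uniformly from the box; count true/false results."""
    alpha = beta = 0
    for _ in range(n):
        x, y = rng.randint(*BOX[0]), rng.randint(*BOX[1])
        if query(x, y):
            alpha += 1
        else:
            beta += 1
    return alpha, beta


def credible_interval(alpha, beta, omega, rng, draws=100_000):
    """Equal-tailed credible interval of Beta(alpha, beta), estimated from
    Monte Carlo quantiles (an approximation; requires alpha, beta > 0)."""
    xs = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1 - omega) / 2
    return xs[int(tail * draws)], xs[int((1 - tail) * draws) - 1]


def refine_size(box_count, p_lo, p_hi, s_min, s_max):
    """Step 3: tighten [s_min, s_max], keeping an old bound when it is better.
    Rounding inward is an assumption (support sizes are whole numbers)."""
    return (max(s_min, math.ceil(p_lo * box_count)),
            min(s_max, math.floor(p_hi * box_count)))


rng = random.Random(0)
alpha, beta = sample_counts(1000, rng)
p_lo, p_hi = credible_interval(alpha, beta, 0.9, rng)
box_count = (BOX[0][1] - BOX[0][0] + 1) * (BOX[1][1] - BOX[1][0] + 1)
print(alpha, beta, refine_size(box_count, p_lo, p_hi, 55, box_count))

# The arithmetic of the worked example, with the interval taken from the text.
print(refine_size(135, Fraction("0.53"), Fraction("0.60"), 55, 135))  # (72, 81)
```

The sampled counts depend on the seed, so only the last line, which reuses the interval quoted in the text, has a fixed result.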

There are several things to notice about this procedure. First, observe that in 
step 2b we “run” the program using the point distribution $\dot{\sigma}$ as an
input; in the case that $S$ is deterministic (has no pif statements) the output
distribution will also be a point distribution. However, for programs with pif
statements there are multiple possible outputs depending on which branch is
taken by a pif. We consider all of these outputs so that we can confidently
determine whether the input state $\sigma$ could ever cause $S$ to produce
result o. If so, then $\sigma$ should be considered part of $P_{T+}$. If not,
then we can safely rule it out (i.e., it is part of the overapproximation). 

Second, we only update the size parameters of $P_{T+}$; we make no changes to 
$p_{T+}^{min}$ and $p_{T+}^{max}$. This is because our sampling procedure only
determines whether it is possible for an input state to produce the expected
output. The probability that an input state produces an output state is already
captured (soundly) by $p_T$ so we do not change that. This is useful because the
approximation of $p_T$ does not degrade with the use of the interval domain in
the way the approximation of the size degrades (as illustrated in Fig. 3(b)).
Using sampling is an attempt to regain the precision lost on the size component
(only). 

Finally, the confidence we have that sampling has accurately assessed which 
input states are in the support is orthogonal to the probability of any given state. 
In particular, $P_T$ is an abstraction of a distribution $\delta_T$, which is a
mathematical object. Confidence $\omega$ is a measure of how likely it is that
our abstraction (or, at least, the size part of it) is accurate. 

We prove (in our extended report [43]) that our sampling procedure is sound: 


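Theorem 2 (Sampling is Sound). If $\delta_0 \in \gamma_{\mathbb{P}}(P_0)$,
$\langle\langle S \rangle\rangle P_0 = P$, and $[\![S]\!]\delta_0 = \delta$ then
$\delta_T \in \gamma_{\mathbb{P}}(P_{T+})$ with confidence $\omega$ where 


$\delta_T \stackrel{\text{def}}{=} \delta \wedge (r = o) \upharpoonright T$
$P_T \stackrel{\text{def}}{=} P \wedge (r = o) \upharpoonright T$
$P_{T+} \stackrel{\text{def}}{=} P_T$ sampling revised with confidence $\omega$. 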


6 Improving Precision with Concolic Execution 


Another approach to improving the precision of a probabilistic polyhedron $P$ is 
to use concolic execution. The idea here is to “magnify” the impact of a single 
sample to soundly increase $s^{min}$ by considering its execution symbolically.
More precisely, we concretely execute a program using a particular secret value,
but maintain symbolic constraints about how that value is used. This is referred
to as concolic execution [39]. We use the collected constraints to identify all
points that would induce the same execution path, which we can include as part
of $s^{min}$. 

We begin by defining the semantics of concolic execution, and then show how 
it can be used to increase $s^{min}$ soundly. 


6.1 (Probabilistic) Concolic Execution 


Concolic execution is expressed as rewrite rules defining a judgment $\langle
\Pi, S \rangle \longrightarrow^{p}_{\pi} \langle \Pi', S' \rangle$. Here, $\Pi$
is a pair consisting of a concrete state $\sigma$ and symbolic state $\zeta$.
The latter maps variables $x \in$ Var to symbolic expressions $\mathcal{E}$,
which extend expressions $E$ with symbolic variables $\alpha$. This judgment
indicates that under input state $\Pi$ the statement $S$ reduces to statement
$S'$ and output state $\Pi'$ with probability $p$, with path condition $\pi$.
The path condition is a conjunction of boolean symbolic expressions
$\mathcal{B}$ (which are just boolean expressions $B$ but altered to use
symbolic expressions $\mathcal{E}$ instead of expressions $E$) that record which
branch is taken during execution. For brevity, we omit $\pi$ in a rule when it
is true. 

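The rules for the concolic semantics are given in Fig. 5. Most of these are 
standard, and deterministic (the probability annotation p is 1). Path conditions 
are recorded for if and while, depending on the branch taken. The semantics of 
pif q then $S_1$ else $S_2$ is non-deterministic: the result is that of $S_1$
with probability q, and $S_2$ with probability 1 − q. We write $\zeta(B)$ to
substitute free variables $x \in B$ with their mapped-to values $\zeta(x)$ and
then simplify the result as much as possible. For example, if $\zeta(x) =
\alpha$ and $\zeta(y) = 2$, then $\zeta(x > y + 3) = \alpha > 5$. The same goes
for $\zeta(E)$. 

$$
\begin{array}{ll}
\langle (\sigma, \zeta),\ x := E \rangle \longrightarrow^{1} \langle (\sigma[x \mapsto \sigma(E)],\ \zeta[x \mapsto \zeta(E)]),\ \mathsf{skip} \rangle & \\
\langle (\sigma, \zeta),\ \mathsf{if}\ B\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2 \rangle \longrightarrow^{1}_{\zeta(B)} \langle (\sigma, \zeta),\ S_1 \rangle & \text{if } \sigma(B) \\
\langle (\sigma, \zeta),\ \mathsf{if}\ B\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2 \rangle \longrightarrow^{1}_{\zeta(\neg B)} \langle (\sigma, \zeta),\ S_2 \rangle & \text{if } \sigma(\neg B) \\
\langle \Pi,\ \mathsf{pif}\ q\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2 \rangle \longrightarrow^{q} \langle \Pi,\ S_1 \rangle & \\
\langle \Pi,\ \mathsf{pif}\ q\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2 \rangle \longrightarrow^{1-q} \langle \Pi,\ S_2 \rangle & \\
\langle \Pi,\ S_1 ; S_2 \rangle \longrightarrow^{p}_{\pi} \langle \Pi',\ S_1' ; S_2 \rangle & \text{if } \langle \Pi, S_1 \rangle \longrightarrow^{p}_{\pi} \langle \Pi', S_1' \rangle \\
\langle \Pi,\ \mathsf{skip} ; S \rangle \longrightarrow^{1} \langle \Pi,\ S \rangle & \\
\langle \Pi,\ \mathsf{while}\ B\ \mathsf{do}\ S \rangle \longrightarrow^{1}_{\zeta(B)} \langle \Pi,\ S ; \mathsf{while}\ B\ \mathsf{do}\ S \rangle & \text{if } \sigma(B) \\
\langle \Pi,\ \mathsf{while}\ B\ \mathsf{do}\ S \rangle \longrightarrow^{1}_{\zeta(\neg B)} \langle \Pi,\ \mathsf{skip} \rangle & \text{if } \sigma(\neg B)
\end{array}
$$

Fig. 5. Concolic semantics 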

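We define a complete run of the concolic semantics with the judgment 
$\langle \Pi, S \rangle \Downarrow^{p}_{\pi} \Pi'$, which has two rules: 

$$
\langle \Pi, \mathsf{skip} \rangle \Downarrow^{1}_{true} \Pi
\qquad\qquad
\frac{\langle \Pi, S \rangle \longrightarrow^{p}_{\pi} \langle \Pi', S' \rangle
      \qquad
      \langle \Pi', S' \rangle \Downarrow^{q}_{\pi'} \Pi''}
     {\langle \Pi, S \rangle \Downarrow^{p \cdot q}_{\pi \wedge \pi'} \Pi''}
$$
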

A complete run’s probability is thus the product of the probability of each indi- 
vidual step taken. The run’s path condition is the conjunction of the conditions 
of each step. 

The path condition $\pi$ for a complete run is a conjunction of the (symbolic) 
boolean guards evaluated during an execution. $\pi$ can be converted to
disjunctive normal form (DNF), and given the restrictions of the language the
result is essentially a set of convex polyhedra over symbolic variables $\alpha$. 
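
To make the judgments concrete, the following is a minimal Python sketch of a complete concolic run for a small fragment of the language. It is not the authors' implementation: the tuple-based syntax, the function names `ceval`, `seval`, and `run`, and the use of plain strings such as "a_x" for symbolic variables are all assumptions chosen for brevity. Only `+`, `-`, `<=`, `>`, and negation are supported.

```python
import random
from fractions import Fraction

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "<=": lambda a, b: a <= b, ">": lambda a, b: a > b}


def ceval(e, sigma):
    """Concrete evaluation sigma(E) or sigma(B)."""
    if isinstance(e, (int, bool)):
        return e
    if isinstance(e, str):
        return sigma[e]
    op, *args = e
    vals = [ceval(a, sigma) for a in args]
    return (not vals[0]) if op == "not" else OPS[op](*vals)


def seval(e, zeta):
    """Symbolic substitution zeta(E): replace program variables by their
    symbolic values, then fold any sub-expression that became constant."""
    if isinstance(e, (int, bool)):
        return e
    if isinstance(e, str):
        return zeta[e]
    op, *args = e
    vals = [seval(a, zeta) for a in args]
    if all(isinstance(v, (int, bool)) for v in vals):
        return ceval((op, *vals), {})
    return (op, *vals)


def run(stmt, sigma, zeta, rng):
    """Complete run: returns (sigma', zeta', p, pi), where pi is the list of
    recorded guards, read as a conjunction."""
    sigma, zeta = dict(sigma), dict(zeta)
    prob, path, work = Fraction(1), [], [stmt]

    def record(guard):
        if guard is not True:  # omit path conditions that are trivially true
            path.append(guard)

    while work:
        s = work.pop()
        kind = s[0]
        if kind == "assign":
            _, x, e = s
            sigma[x], zeta[x] = ceval(e, sigma), seval(e, zeta)
        elif kind == "seq":
            work.extend(reversed(s[1:]))
        elif kind == "if":
            _, b, s1, s2 = s
            taken = ceval(b, sigma)
            record(seval(b if taken else ("not", b), zeta))
            work.append(s1 if taken else s2)
        elif kind == "pif":
            _, q, s1, s2 = s
            first = rng.random() < q
            prob *= q if first else 1 - q
            work.append(s1 if first else s2)
        elif kind == "while":
            _, b, body = s
            if ceval(b, sigma):
                record(seval(b, zeta))
                work.extend([s, body])  # run the body, then re-test the loop
            else:
                record(seval(("not", b), zeta))
        # kind == "skip": nothing to do
    return sigma, zeta, prob, path


# y := 2; if x > y + 3 then r := 1 else r := 0, with x secret.
prog = ("seq", ("assign", "y", 2),
        ("if", (">", "x", ("+", "y", 3)),
         ("assign", "r", 1), ("assign", "r", 0)))
sigma_out, zeta_out, p, pi = run(prog, {"x": 7}, {"x": "a_x"}, random.Random(0))
print(sigma_out["r"], p, pi)  # 1 1 [('>', 'a_x', 5)]
```

The run with x = 7 records the simplified guard a_x > 5, matching the substitution example for $\zeta(x > y + 3)$; any other secret value above 5 would follow the same path.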


6.2 Improving Precision 


Using concolic execution, we can improve our estimate of the size of a proba- 
bilistic polyhedron as follows: 


1. Randomly select an input state $\sigma_T \in \gamma_C(C_T)$ (recall that
   $C_T$ is the polyhedron describing the possible valuations of secrets $T$). 

2. Set $\Pi = (\sigma_T, \zeta_T)$ where $\zeta_T$ maps each variable $x \in T$
   to a fresh symbolic variable $\alpha_x$. Perform a complete concolic run
   $\langle \Pi, S \rangle \Downarrow^{p}_{\pi} (\sigma', \zeta')$. Make sure
   that $\sigma'(r) = o$, i.e., the expected output. If not, select a new
   $\sigma_T$ and retry. Give up after some number of failures N. For our
   example shown in Fig. 3(b), we might obtain a path condition
   $|Loc(z).x - L_1.x| + |Loc(z).y - L_1.y| \le 4$ that captures the left
   diamond of the disjunctive query. 

3. After a successful concolic run, convert path condition $\pi$ to DNF, where
   each conjunctive clause is a polyhedron $C_i$. Also convert uses of
   non-strict inequality ($\le$ and $\ge$) to be strict ($<$ and $>$). 

4. Let $C = C_T \sqcap (\bigsqcup_i C_i)$; that is, it is the join of each of
   the polyhedra in DNF($\pi$) “intersected” with the original constraints. This
   captures all of the points that could possibly lead to the observed outcome
   along the concolically executed path. Compute $n = \#(C)$. Let $P_{T+} = P_T$
   except define $s_{T+}^{min} = n$ if $s_T^{min} < n$ and $m_{T+}^{min} =
   p_T^{min} \cdot n$ if $m_T^{min} < p_T^{min} \cdot n$. (Leave them as is,
   otherwise.) For our example, n = 41, the size of the left diamond. We do not
   update $s_T^{min}$ since 41 < 55, the probabilistic polyhedron’s lower bound
   (but see below). 
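
Step 4 can be illustrated with a short Python sketch. Brute-force enumeration stands in for the counting procedure, which is feasible only because the example region is tiny. The island position, the rectangle, the helper names, and the per-point lower bound of 0.01 are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical placement of the running example (only relative offsets matter).
L1 = (0, 0)
BOX = ((-4, 10), (-4, 4))  # interval over-approximation C_T, 135 points


def path_condition(x, y):
    """Guard collected along the concolic run: within distance 4 of L1."""
    return abs(x - L1[0]) + abs(y - L1[1]) <= 4


def count_region(box, pred):
    """Brute-force stand-in for #(C): count box points satisfying pred."""
    points = product(*(range(lo, hi + 1) for lo, hi in box))
    return sum(1 for pt in points if pred(*pt))


def concolic_refine(s_min, m_min, p_min, n):
    """Raise the lower bounds only when the concolic count improves them."""
    return max(s_min, n), max(m_min, p_min * n)


n = count_region(BOX, path_condition)
print(n)  # 41

# With a prior lower bound of 55 points the bounds stay as they are.
print(concolic_refine(55, Fraction("0.55"), Fraction("0.01"), n))
```

Because every counted point is known to produce the observed output, taking the maximum can only tighten the lower bounds and never invalidates them.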


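Theorem 3 (Concolic Execution is Sound). If $\delta_0 \in
\gamma_{\mathbb{P}}(P_0)$, $\langle\langle S \rangle\rangle P_0 = P$, and
$[\![S]\!]\delta_0 = \delta$ then $\delta_T \in \gamma_{\mathbb{P}}(P_{T+})$
where 


$\delta_T \stackrel{\text{def}}{=} \delta \wedge (r = o) \upharpoonright T$
$P_T \stackrel{\text{def}}{=} P \wedge (r = o) \upharpoonright T$
$P_{T+} \stackrel{\text{def}}{=} P_T$ concolically revised. 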


The proof is in the extended technical report [43]. 


6.3 Combining Sampling with Concolic Execution 


Sampling can be used to further augment the results of concolic execution. The 
key insight is that the presence of a sound under-approximation generated by 
the concolic execution means that it is unnecessary to sample from the under- 
approximating region. Here is the algorithm: 


1. Let $C = C_T \sqcap (\bigsqcup_i C_i)$ be the under-approximating region. 
2. Perform sampling per the algorithm in Sect. 5, but with two changes: 
   - If a sampled state $\sigma_T \in \gamma_C(C)$, ignore it. 
   - When done sampling, compute $s_{T+}^{min} = p_L \cdot (\#(C_T) - \#(C)) +
     \#(C)$ and $s_{T+}^{max} = p_U \cdot (\#(C_T) - \#(C)) + \#(C)$. This
     differs from Sect. 5 in not including the count from concolic region $C$
     in the computation. This is because, since we ignored samples $\sigma_T
     \in \gamma_C(C)$, the credible interval $[p_L, p_U]$ bounds the likelihood
     that any given point in $C_T \setminus C$ is in the support of the true
     distribution. 


For our example, concolic execution indicated there are at least 41 points that 
satisfy the query. With this in hand, and using the same samples as shown in 
Sect. 5, we can refine $s \in [74, 80]$ and $m \in [0.74, 1.60]$ (the credible
interval is formed over only those samples which satisfy the query but fall
outside the under-approximation returned by concolic execution). We improve the
vulnerability estimate to $V \le \frac{0.02}{0.74} = 0.027$. These bounds (and
vulnerability estimate) are better than those of sampling alone ($s \in [72,
81]$ with $V \le 0.028$). 
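
The combined size computation is simple enough to check directly. In the Python sketch below, the function name is hypothetical and the credible interval [0.35, 0.42] is a made-up value, chosen only because it is consistent with the reported bounds; the text does not state the interval obtained over the points outside the concolic region. The per-point bounds 0.01 and 0.02 and the inward rounding are likewise assumptions.

```python
import math
from fractions import Fraction


def combined_size_bounds(p_lo, p_hi, box_count, concolic_count):
    """Size bounds when the credible interval covers only points outside the
    concolic region; points inside it are known to be in the support.
    Rounding inward to whole numbers is an assumption of this sketch."""
    outside = box_count - concolic_count
    s_min = math.ceil(p_lo * outside) + concolic_count
    s_max = math.floor(p_hi * outside) + concolic_count
    return s_min, s_max


# Hypothetical interval over the 135 - 41 = 94 points outside the left diamond.
s_min, s_max = combined_size_bounds(Fraction("0.35"), Fraction("0.42"), 135, 41)
print(s_min, s_max)  # 74 80

# Propagate to the mass bounds and the vulnerability (assumed per-point bounds).
p_min, p_max = Fraction("0.01"), Fraction("0.02")
m_min, m_max = p_min * s_min, p_max * s_max
print(float(m_min), float(m_max))      # 0.74 1.6
print(round(float(p_max / m_min), 3))  # 0.027
```

The known 41 points enter the bounds at full weight, which is why the interval here is narrower than the one obtained from sampling alone.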

The statement of soundness and its proof can be found in the extended 
technical report [43]. 


7 Implementation 


We have implemented our approach as an extension of Mardziel et al. [25], which 
is written in OCaml. This baseline implements numeric domains C via an OCaml 
interface to the Parma Polyhedra Library [4]. The counting procedure #(C) is 
implemented by LattE [15]. Support for arbitrary precision and exact arithmetic 
(e.g., for manipulating $m^{min}$, $p^{min}$, etc.) is provided by the mlgmp
OCaml interface to the GNU Multi Precision Arithmetic library. Rather than
maintaining 
a single probabilistic polyhedron P, the implementation maintains a powerset 
of polyhedra [3], i.e., a finite disjunction. Doing so results in a more precise 
handling of join points in the control flow, at a somewhat higher performance 
cost. 

We have implemented our extensions to this baseline for the case that domain 
C is the interval numeric domain [11]. Of course, the theory fully applies to any 
numeric abstract domain. We use Gibbs sampling, which we implemented our- 
selves. We delegate the calculation of the beta distribution and its corresponding 
credible interval to the ocephes OCaml library, which in turn uses the GNU 
Scientific Library. It is straightforward to lift the various operations we have 
described to the powerset domain. All of our code is available at
https://github.com/GaloisInc/TAMBA. 


8 Experiments 


To evaluate the benefits of our techniques, we applied them to queries based 
on the evacuation problem outlined in Sect. 2. We found that while the base- 
line technique can yield precise answers when computing vulnerability, our new 
techniques can achieve close to the same level of precision far more efficiently. 


8.1 Experimental Setup 


For our experiments we analyzed queries similar to Nearby(s, l, d) from Fig. 2. 
We generalize the Nearby query to accept a set of locations L: the query returns 
true if s is within d units of any one of the islands having location $l \in L$. In 
our experiments we fix d = 100. We consider the secrecy of the location of s, 
Location(s). We also analyze the execution of the resource allocation algorithm 
of Fig. 2 directly; we discuss this in Sect. 8.3. 

We measure the time it takes to compute the vulnerability (i.e., the prob- 
ability of the most probable point) following each query. In our experiments, 
we consider a single ship s and set its coordinates so that it is always in 
range of some island in L, so that the concrete query result returns true (i.e., 
Nearby(s, L, 100) = true). We measure the vulnerability following this query 
result starting from a prior belief that the coordinates of s are uniformly
distributed with $0 \le$ Location(s).x $\le 1000$ and $0 \le$ Location(s).y
$\le 1000$. 

In our experiments, we varied several experimental parameters: analysis 
method (either P, I, CE, S, or CE+S); query complexity c; AI precision level 
p; and number of samples n. We describe each in turn. 

Analysis Method. We compared five techniques for computing vulnerability: 


P: Abstract interpretation (AI) with convex polyhedra for domain C (Sect. 4), 
I: AI with intervals for C (Sect. 4), 
S: AI with intervals augmented with sampling (Sect. 5), 
CE: AI with intervals augmented with concolic execution (Sect. 6), and 
CE+S: AI with intervals augmented with both techniques (Sect. 6.3) 


The first two techniques are due to Mardziel et al. [25], where the former uses 
convex polyhedra and the latter uses intervals (aka boxes) for the underlying 
polygons. In our experiments we tend to focus on P since I’s precision is
unacceptably poor (e.g., often vulnerability = 1). 


Query Complexity. We consider queries with different L; we say we are increasing 
the complexity of the query as L gets larger. Let c = |L|; we consider $1 \le c
\le 5$, where larger L include the same locations as smaller ones. We set each
location to be at least $2 \cdot d$ Manhattan distance units away from any other
island (so diamonds like those in Fig. 3(a) never overlap). 
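
The generalized query and the spacing constraint can be written down directly. The Python sketch below uses hypothetical island coordinates and helper names; the actual locations used in the experiments are not given in the text.

```python
from itertools import combinations


def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def nearby_any(ship, islands, d):
    """Generalized Nearby: true if the ship is within d of any island in L."""
    return any(manhattan(ship, loc) <= d for loc in islands)


def well_separated(islands, d):
    """True if every pair of islands is at least 2*d apart."""
    return all(manhattan(a, b) >= 2 * d for a, b in combinations(islands, 2))


D = 100
# Hypothetical island locations inside the 0..1000 square.
ISLANDS = [(100, 100), (400, 100), (700, 100), (100, 500), (400, 500)]

# Queries of complexity c = 1..5 reuse the first c locations, so larger L
# include the same locations as smaller ones.
for c in range(1, 6):
    L = ISLANDS[:c]
    print(c, well_separated(L, D), nearby_any((390, 480), L, D))
```

In this sketch the ship at (390, 480) is only within range of the fifth island, so the query result changes from false to true once c reaches 5.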


Precision. The precision parameter p bounds the size of the powerset abstract 
domain at all points during abstract interpretation. This has the effect of forcing 
joins when the powerset grows larger than the specified precision. As p grows 
larger, the results of abstract interpretation are likely to become more precise 
(i.e. vulnerability gets closer to the true value). We considered p values of 1, 2, 
4, 8, 16, 32, and 64. 
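
The effect of the precision bound can be sketched for the interval domain in Python. The text only says that joins are forced once the powerset exceeds p; the merge heuristic used here (join the pair whose bounding box is smallest) and the function names are assumptions made for illustration.

```python
from itertools import combinations


def join(a, b):
    """Interval join: the smallest box containing both boxes."""
    return tuple((min(lo1, lo2), max(hi1, hi2))
                 for (lo1, hi1), (lo2, hi2) in zip(a, b))


def size(box):
    """Number of integer points in a box with inclusive bounds."""
    n = 1
    for lo, hi in box:
        n *= hi - lo + 1
    return n


def bound_powerset(boxes, p):
    """Force joins until at most p boxes remain (requires p >= 1).
    Which pair to merge is a heuristic assumed for this sketch."""
    if p < 1:
        raise ValueError("precision must be at least 1")
    boxes = list(boxes)
    while len(boxes) > p:
        i, j = min(combinations(range(len(boxes)), 2),
                   key=lambda ij: size(join(boxes[ij[0]], boxes[ij[1]])))
        merged = join(boxes[i], boxes[j])
        boxes = [b for k, b in enumerate(boxes) if k not in (i, j)] + [merged]
    return boxes


# Bounding boxes of two radius-4 diamonds whose centers are 6 units apart.
left, right = ((-4, 4), (-4, 4)), ((2, 10), (-4, 4))
print(bound_powerset([left, right], 2))  # both boxes kept
print(bound_powerset([left, right], 1))  # [((-4, 10), (-4, 4))]
```

With p = 1 the two boxes collapse into a single rectangle, which is the source of the precision loss that sampling and concolic execution try to recover.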


Samples Taken. For the latter three analysis methods, we varied the number of 
samples taken n. For analysis CE, n is interpreted as the number of samples 
to try per polyhedron before giving up trying to find a “valid sample.”⁵ For 
analysis S, n is the number of samples, distributed proportionally across all the 
polyhedra in the powerset. For analysis CE+S, n is the combination of the two. 
We considered sample size values of 1,000 to 50,000 in increments of 1,000. We 
always compute an interval with $\omega = 99.9\%$ confidence (which will be
wider when fewer samples are used). 


System Description. We ran experiments varying all possible parameters. For 
each run, we measured the total execution time (wall clock) in seconds to analyze 
the query and compute vulnerability. All experiments were carried out on a 
MacBook Air with OS X version 10.11.6, a 1.7 GHz Intel Core i7, and 8 GB of 
RAM. We ran a single trial for each configuration of parameters. Only wall-clock 
time varies across trials; informally, we observed time variations to be small. 


8.2 Results 


Figure 6(a)–(c) measure vulnerability (y-axis) as a function of time (x-axis) for 
each analysis. These three figures characterize three interesting “zones” in the 


5 This is the N parameter from Sect. 6. 
6 These are best viewed on a color display. 

Fig. 6. Experimental results 


space of complexity and precision. The results for method I are not shown in any 
of the figures. This is because I always produces a vulnerability of 1. The refine- 
ment methods (CE, S, and CE+S) are all over the interval domain, and should 
be considered as “improving” the vulnerability of I. 

In Fig. 6(a) we fix c = 1 and p = 1. In this configuration, baseline analysis 
P can compute the true vulnerability in ~0.95 s. Analysis CE is also able to 
compute the true vulnerability, but in ~0.19 s. Analysis S is able to compute a 
vulnerability to within ~$5 \cdot e^{-6}$ of optimal in ~0.15 s. These data
points support two key observations. First, even a very modest number of samples
improves vulnerability significantly over just analyzing with intervals. Second,
concolic execution is only slightly slower and can achieve the optimal
vulnerability. Of course, concolic execution is not a panacea. As we will see, a
feature of this configuration is that no joins take place during abstract
interpretation. This is critical to the precision of the concolic execution. 

In Fig. 6(b) we fix c = 2 and p = 4. In contrast to the configuration of 
Fig. 6(a), the values for c and p in this configuration are not sufficient to prevent 
all joins during abstract interpretation. This has the effect of taking polygons 
that represent individual paths through the program and joining them into a 
single polygon representing many paths. We can see that this is the case because 
baseline analysis P is now achieving a better vulnerability than CE. However, one 
pattern from the previous configuration persists: all three refinement methods 
(CE, S, CE+S) can achieve vulnerability within ~$1 \cdot e^{-5}$ of P, but in
1/4 the time. In contrast to the previous configuration, analysis CE+S is now
able to make a modest improvement over CE (since it does not achieve the
optimal). 

Table 1. Analyzing a 3-ship resource allocation run 

Resource allocation (3 ships) 

Analysis | Time (s)        | Vulnerability
P        | Timeout (5 min) | N/A
I        | 0.516           | 1
CE       | 16.650          | $1.997 \cdot 10^{-24}$
S        | 1.487           | $1.962 \cdot 10^{-24}$
CE+S     | 17.452          | $1.037 \cdot 10^{-24}$

In Fig. 6(c) we fix c = 5 and p = 32. This configuration magnifies the effects 
we saw in Fig. 6(b). Similarly, in this configuration there are joins happening, 
but the query is much more complex and the analysis is much more precise. 
In this figure, we label the x-axis as a log scale over time. This is because 
analysis P took over two minutes to complete, in contrast to the longest-running 
refinement method, which took less than 6 seconds. The relationship between the 
refinement analyses is similar to the previous configuration. The key observation 
here is that, again, all three refinement analyses achieve within ~$3 \cdot
e^{-5}$ of P, but this time in 4% of the time (as opposed to 1/4 in the previous
configuration). 

Figure 6(d) makes more explicit the relationship between refinements (CE, 
S, CE+S) and P. We fix n = 50,000 (the maximum) here, and p = 64 (the 
maximum). We can see that as query complexity goes up, P gets exponentially 
slower, while CE, S, and CE+S slow at a much lower rate, while retaining (per 
the previous graphs) similar precision. 


8.3 Evacuation Problem 


We conclude this section by briefly discussing an analysis of an execution of the 
resource allocation algorithm of Fig. 2. In our experiment, we set the number of 
ships to be three, where two were in range d = 300 of the evacuation site, and 
their sum-total berths (500) were sufficient to satisfy demand at the site (also 
500). For our analysis refinements we set n = 1000. Running the algorithm, a 
total of seven pairs of Nearby and Capacity queries were issued. In the end, the 
algorithm selects two ships to handle the evacuation. 

Table 1 shows the time to execute the algorithm using the different analysis 
methods, along with the computed vulnerability; this latter number represents 
the coordinator’s view of the most likely nine-tuple of the private data of the 
three ships involved (x coordinate, y coordinate, and capacity for each). We can 
see that, as expected, our refinement analyses are far more efficient than baseline 
P, and far more precise than baseline I. The CE methods are precise but slower 
than S. This is because of the need to count the number of points in the DNF 
of the concolic path conditions, which is expensive. 


Discussion. The queries considered in Fig. 6 have two features that contribute 
to the effectiveness of our refinement methods. First, they are defined over large 
domains, but return true for only a small subset of those values. For larger subsets 
of values, the benefits of sampling may degrade, though concolic execution should 
still provide an improvement. Further experiments are needed to explore such 
scenarios. Second, the example in Fig. 6 contains short but complex queries. A 
result of this query structure is that abstract interpretation with polyhedra is 
expensive but sampling can be performed efficiently. The evacuation problem 
results in Table 1 provide some evidence that the benefits of our techniques also 
apply to longer queries. However, it may still be possible to construct queries 
where the gap in runtime between polyhedral analysis and sampling is smaller, 
in which case sampling would provide less improvement. 


9 Related Work 


Quantifying Information Flow. There is a rich research literature on techniques 
that aim to quantify information that a program may release, or has released, and 
then use that quantification as a basis for policy. One question is what measure 
of information release should be used. Past work largely considers information 
theoretic measures, including Bayes vulnerability [41] and Bayes risk [8], Shan- 
non entropy [40], and guessing entropy [26]. The g-vulnerability framework [1] 
was recently introduced to express measures having richer operational interpre- 
tations, and subsumes other measures. 

Our work focuses on Bayes vulnerability, which is related to min-entropy. 
Vulnerability is appealing operationally: as Smith [41] explains, it estimates 
the risk of the secret being guessed in one try. While challenging to compute, 
this approach provides meaningful results for non-uniform priors. Work that has 
focused on other, easier-to-compute metrics, such as Shannon entropy and channel
capacity, requires deterministic programs and priors that conform to uniform 
distributions [2, 22, 23, 27, 32]. The work of Klebanov [20] supports computation 
of both Shannon entropy and min-entropy over deterministic programs with 
non-uniform priors. The work takes a symbolic execution and program specifi- 
cation approach to QIF. Our use of concolic execution for counting polyhedral 
constraints is similar to that of Klebanov. However, our language supports prob- 
abilistic choice and in addition to concolic execution we also provide a sampling 
technique and a sound composition. Like Mardziel et al. [25], we are able to com- 
pute the worst-case vulnerability, i.e., due to a particular output, rather than a 
static estimate, i.e., as an expectation over all possible outputs. Köpf and Basin 
[21] originally proposed this idea, and Mardziel et al. were the first to implement 
it, followed by several others [6, 19, 24]. 

Köpf and Rybalchenko [22] (KR) also use sampling and concolic execution 
to statically quantify information leakage. But their approach is quite different 
from ours. KR uses sampling of a query’s inputs in lieu of considering (as we 
do) all possible outputs, and uses concolic execution with each sample to
ultimately compute Shannon entropy, by underapproximation, within a confidence 
interval. This approach benefits from not having to enumerate outputs, but also 
requires expensive model counting for each sample. By contrast, we use sampling 
and concolic execution from the posterior computed by abstract interpretation, 
using the results to boost the lower bound on the size/probability mass of the 
abstraction. Our use of sampling is especially efficient, and the use of concolic 
execution is completely sound (i.e., it retains 100% confidence in the result). As 
with the above work, KR requires deterministic programs and uniform priors. 


Probabilistic Programming Languages. A probabilistic program is essentially a 
lifting of a normal program operating on single values to a program operating 
on distributions of values. As a result, the program represents a joint distribu- 
tion over its variables [18]. As discussed in this paper, quantifying the informa- 
tion released by a query can be done by writing the query in a probabilistic 
programming language (PPL) and representing the uncertain secret inputs as 
distributions. Quantifying release generally corresponds to either the maximum 
likelihood estimation (MLE) problem or the maximum a-posteriori probability 
(MAP) problem. Not all PPLs support computation of MLE and MAP, but 
several do. 

PPLs based on partial sampling [17,34] or full enumeration [37] of the state 
space are unsuitable in our setting: they are either too inefficient or too impre- 
cise. PPLs based on algebraic decision diagrams [9], graphical models [28], and 
factor graphs [7,30,36] translate programs into convenient structures and take 
advantage of efficient algorithms for their manipulation or inference, in some 
cases supporting MAP or MLE queries (e.g. [33,35]). PSI [16] supports exact 
inference via computation of precise symbolic representations of posterior dis- 
tributions, and has been used for dynamic policy enforcement [24]. Guarnieri 
et al. [19] use probabilistic logic programming as the basis for inference; it scales 
well but only for a class of queries with certain structural limits, and which do 
not involve numeric relationships. 

Our implementation for probabilistic computation and inference differs from 
the above work in two main ways. Firstly, we are capable of sound approximation 
and hence can trade off precision for performance, while maintaining soundness 
in terms of a strong security policy. Even when using sampling, we are able 
to provide precise confidence measures. The second difference is our composi- 
tional representation of probability distributions, which is based on numerical 
abstractions: intervals [11], octagons [29], and polyhedra [13]. The posterior can 
be easily used as the prior for the next query, whereas prior work would have to 
repeatedly analyze the composition of past queries. 

A few other works have also focused on abstract interpretation, or related 
techniques, for reasoning about probabilistic programs. Monniaux [31] defines 
an abstract domain for distributions. Smith [42] describes probabilistic abstract 
interpretation for verification of quantitative program properties. Cousot [14] 
unifies these and other probabilistic program analysis tools. However, these do 
not deal with sound distribution conditioning, which is crucial for belief-based 
information flow analysis. Work by Sankaranarayanan et al. [38] uses a
combination of techniques from program analysis to reason about distributions
(including abstract interpretation), but the representation does not support
efficient retrieval of the maximal probability, needed to compute vulnerability. 


10 Conclusions 


Quantitative information flow is concerned with measuring the knowledge about 
secret data that is gained by observing the answer to a query. This paper has 
presented a combination of static analysis using probabilistic abstract interpre- 
tation, sampling, and underapproximation via concolic execution to compute 
high-confidence upper bounds on information flow. Preliminary experimental 
results are promising and suggest that this approach can operate more precisely 
and efficiently than abstract interpretation alone. As next steps, we plan to eval- 
uate the technique more rigorously — including on programs with probabilistic 
choice. We also plan to integrate static analysis and sampling more closely so 
as to avoid precision loss at decision points in programs. We also look to extend 
programs to be able to store random choices in variables, to thereby implement 
more advanced probabilistic structures. 


A Query Code 


The following is the query code of the example developed in Sect. 2.2. Here, s_x
and s_y represent a ship’s secret location. The variables l1_x, l1_y, l2_x, l2_y, and
d are inputs to the query. The first pair represents position L1, the second pair
represents the position L2, and the last is the distance threshold, set to 4. We
assume for the example that L1 and L2 have the same y coordinate, and their x
coordinates differ by 6 units.

We express the query in the language of Fig. 4 basically as follows: 


d_l1 := |s_x - l1_x| + |s_y - l1_y|;
d_l2 := |s_x - l2_x| + |s_y - l2_y|;
if (d_l1 <= d || d_l2 <= d) then
  out := true // assume this result
else
  out := false

The variable out is the result of the query. We simplify the code by assuming 
the absolute value function is built-in; we can implement this with a simple 
conditional. We run this query probabilistically under the assumption that s_x
and s_y are uniformly distributed within the range given in Fig. 1. We then
condition the output on the assumption that out = true. When using intervals 
as the baseline of probabilistic polyhedra, this produces the result given in the 
upper right of Fig. 3(b); when using convex polyhedra, the result is shown in the 
lower right of the figure. The use of sampling and concolic execution to augment 
the former is shown via arrows between the two. 
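As a rough illustration of what running and conditioning the query means, the following Python sketch enumerates a small uniform prior exactly and conditions it on out = true. The grid bounds and the positions chosen for L1 and L2 are assumptions made only for this example (the actual range is the one given in Fig. 1), and exhaustive enumeration stands in for the probabilistic polyhedra used in the paper.

```python
from fractions import Fraction
from itertools import product


def query(s_x, s_y, l1_x, l1_y, l2_x, l2_y, d):
    """Return True if the ship is within Manhattan distance d of L1 or L2."""
    d_l1 = abs(s_x - l1_x) + abs(s_y - l1_y)
    d_l2 = abs(s_x - l2_x) + abs(s_y - l2_y)
    return d_l1 <= d or d_l2 <= d


# Assumed inputs: same y coordinate, x coordinates 6 apart, threshold 4.
L1, L2, D = (0, 0), (6, 0), 4

# Assumed prior: uniform over an integer grid (a stand-in for the range in Fig. 1).
prior_states = list(product(range(-10, 17), range(-10, 11)))
prior = {s: Fraction(1, len(prior_states)) for s in prior_states}


def condition(dist, predicate):
    """Keep the states satisfying the predicate and renormalize."""
    kept = {s: p for s, p in dist.items() if predicate(s)}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}


posterior = condition(prior, lambda s: query(s[0], s[1], *L1, *L2, D))

# Vulnerability: the probability of the adversary's best single guess.
vulnerability = max(posterior.values())
```

Under these assumed inputs the two distance-4 regions overlap, so the posterior is uniform over their union and the vulnerability is the reciprocal of the number of states in that union.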
B Formal Semantics


Here we define the probabilistic semantics for the programming language given
in Fig. 4. The semantics of statement $S$, written $[\![S]\!]$, is a function of the form
$\mathbf{Dist} \to \mathbf{Dist}$, i.e., it is a function from distributions of states to distributions of states.
We write $[\![S]\!]\delta = \delta'$ to say that the semantics of $S$ maps input distribution $\delta$ to
output distribution $\delta'$.

Figure 7 gives this denotational semantics along with definitions of relevant
auxiliary operations. We write $[\![E]\!]\sigma$ to denote the (integer) result of evaluating
expression $E$ in $\sigma$, and $[\![B]\!]\sigma$ to denote the truth or falsehood of $B$ in $\sigma$. The set of
variables of a state $\sigma$ is written $domain(\sigma)$; sometimes we will
refer to this set as just the domain of $\sigma$. We will also use this notation for
distributions: $domain(\delta) = domain(domain(\delta))$. We write lfp for the least fixed-point
operator. The notation $\sum_{x : \phi} \rho$ can be read: $\rho$ is summed over all $x$ such that
formula $\phi$ is satisfied (where $x$ is bound in $\rho$ and $\phi$).

This semantics is standard. See Clarkson et al. [10] or Mardziel et al. [25] for 
detailed explanations. 


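\[
\begin{aligned}
[\![\mathsf{skip}]\!]\delta &= \delta \\
[\![x := E]\!]\delta &= \delta[x \to E] \\
[\![\mathsf{if}\ B\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2]\!]\delta &= [\![S_1]\!](\delta \wedge B) + [\![S_2]\!](\delta \wedge \neg B) \\
[\![\mathsf{pif}\ q\ \mathsf{then}\ S_1\ \mathsf{else}\ S_2]\!]\delta &= [\![S_1]\!](q \cdot \delta) + [\![S_2]\!]((1 - q) \cdot \delta) \\
[\![S_1 ; S_2]\!]\delta &= [\![S_2]\!]([\![S_1]\!]\delta) \\
[\![\mathsf{while}\ B\ \mathsf{do}\ S]\!] &= \mathrm{lfp}\,[\lambda f : \mathbf{Dist} \to \mathbf{Dist}.\ \lambda\delta.\ f([\![S]\!](\delta \wedge B)) + (\delta \wedge \neg B)]
\end{aligned}
\]

where

\[
\begin{aligned}
\delta[x \to E] &\stackrel{\mathrm{def}}{=} \lambda\sigma.\ \sum_{\tau \,:\, \tau[x \to [\![E]\!]\tau] = \sigma} \delta(\tau) \\
\delta_1 + \delta_2 &\stackrel{\mathrm{def}}{=} \lambda\sigma.\ \delta_1(\sigma) + \delta_2(\sigma) \\
\delta \wedge B &\stackrel{\mathrm{def}}{=} \lambda\sigma.\ \text{if } [\![B]\!]\sigma \text{ then } \delta(\sigma) \text{ else } 0 \\
p \cdot \delta &\stackrel{\mathrm{def}}{=} \lambda\sigma.\ p \cdot \delta(\sigma) \\
\|\delta\| &\stackrel{\mathrm{def}}{=} \sum_{\sigma} \delta(\sigma) \\
normal(\delta) &\stackrel{\mathrm{def}}{=} \frac{1}{\|\delta\|} \cdot \delta \\
\delta \,|\, B &\stackrel{\mathrm{def}}{=} normal(\delta \wedge B) \\
\delta_1 \times \delta_2 &\stackrel{\mathrm{def}}{=} \lambda(\sigma_1, \sigma_2).\ \delta_1(\sigma_1) \cdot \delta_2(\sigma_2) \\
\dot{\sigma} &\stackrel{\mathrm{def}}{=} \lambda\sigma'.\ \text{if } \sigma = \sigma' \text{ then } 1 \text{ else } 0 \\
\sigma \upharpoonright V &\stackrel{\mathrm{def}}{=} \lambda x \in \mathbf{Var}_V.\ \sigma(x) \\
\delta \upharpoonright V &\stackrel{\mathrm{def}}{=} \lambda\sigma_V \in \mathbf{State}_V.\ \sum_{\tau \,:\, \tau \upharpoonright V = \sigma_V} \delta(\tau) \\
f_x(\delta) &\stackrel{\mathrm{def}}{=} \delta \upharpoonright (domain(\delta) - \{x\}) \\
support(\delta) &\stackrel{\mathrm{def}}{=} \{\sigma : \delta(\sigma) > 0\}
\end{aligned}
\]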

Fig. 7. Distribution semantics 
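
To make the shape of these definitions concrete, here is a minimal Python sketch of a few of the operations in Fig. 7, with a distribution represented as a sparse dictionary from states to exact probabilities. The representation and all function names (`freeze`, `assign`, `restrict`, `sem_if`, and so on) are illustrative assumptions, not the implementation used in the paper; loops are omitted.

```python
from fractions import Fraction


def freeze(state):
    """Turn a dict state into a hashable key."""
    return tuple(sorted(state.items()))


def assign(dist, x, expr):
    """delta[x -> E]: push every state through the assignment, summing mass."""
    out = {}
    for frozen, p in dist.items():
        state = dict(frozen)
        state[x] = expr(state)
        key = freeze(state)
        out[key] = out.get(key, 0) + p
    return out


def restrict(dist, cond):
    """delta AND B: keep only the mass of states satisfying B (sparse form)."""
    return {s: p for s, p in dist.items() if cond(dict(s))}


def plus(d1, d2):
    """delta1 + delta2: pointwise sum."""
    out = dict(d1)
    for s, p in d2.items():
        out[s] = out.get(s, 0) + p
    return out


def scale(q, dist):
    """q . delta: scale all probabilities by q."""
    return {s: q * p for s, p in dist.items()}


def normal(dist):
    """normal(delta): divide by the total mass (raises if the mass is zero)."""
    return scale(1 / sum(dist.values()), dist)


def sem_if(cond, s1, s2):
    """[[if B then S1 else S2]] as a function from distributions to distributions."""
    return lambda dist: plus(s1(restrict(dist, cond)),
                             s2(restrict(dist, lambda st: not cond(st))))


def sem_pif(q, s1, s2):
    """[[pif q then S1 else S2]]: split the mass with weights q and 1 - q."""
    return lambda dist: plus(s1(scale(q, dist)), s2(scale(1 - q, dist)))


# Example: uniform prior on x in {0,1,2,3}; out := 1 if x >= 2 else out := 0.
prior = {freeze({"x": v}): Fraction(1, 4) for v in range(4)}
program = sem_if(lambda st: st["x"] >= 2,
                 lambda d: assign(d, "out", lambda st: 1),
                 lambda d: assign(d, "out", lambda st: 0))
after = program(prior)

# delta | B = normal(delta AND B): condition on observing out = 1.
posterior = normal(restrict(after, lambda st: st["out"] == 1))
```

Dropping zero-probability states in `restrict` is a representation choice; it agrees with the definition of $\delta \wedge B$ on every state that still carries mass.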
Abstract. One of the key demands of cyberphysical systems is that
they meet their safety goals. Timed automata have established themselves as
a formalism for modeling and analyzing the real-time safety aspects
of cyberphysical systems. Increasingly it is also demanded that
cyberphysical systems meet a number of security goals for confidentiality and
integrity. Notions of security based on information flow control, such as
non-interference, provide strong guarantees that no information is leaked;
however, many cyberphysical systems intentionally leak some
information in order to achieve their purposes.

In this paper, we develop a formal approach to information flow for
timed automata that allows intentional information leaks. The security 
of a timed automaton is then defined using a bisimulation relation that 
takes account of the non-determinism and the clocks of timed automata. 
Finally, we define an algorithm that traverses a timed automaton and 
imposes information flow constraints on it and we prove that our algo- 
rithm is sound with respect to our security notion. 


1 Introduction 


Motivation. Embedded systems are key components of cyberphysical systems 
and are often subject to stringent safety goals. Among the current approaches 
to the modeling and analysis of timed systems, the approach of timed automata 
[5] stands out as being a very successful approach with well-developed tool sup- 
port — in particular the UPPAAL suite [28] of tools. As cyberphysical systems 
become increasingly distributed and interconnected through wireless communi- 
cation links it becomes even more important to ensure that they meet suitable 
security goals. 

In this paper, we are motivated by an example of a smart power grid system. 
In its very basic form, a smart grid system consists of a meter that measures 
the electricity consumption in a customer’s (C) house and then sends this data 
to the utility company (UC). The detailed measurements of the meter provide 
more accurate billings for UC, while C receives energy management plans that 
optimize his energy consumption. Although this setting seems to be beneficial 
for both UC and C, it has been shown that high-frequency monitoring of the
power flow poses a major threat to the privacy of C [14,23,27]. To deal with
this problem many smart grid systems introduce a trusted third-party (TTP),
on which both UC and C agree [27]. The data of the meter is now collected by
the TTP and by the end of each month the TTP charges C depending on the 
tariff prices defined by UC. In this protocol, UC trusts TTP for the accurate 
billing of C, while C trusts TTP with its sensitive data. However, in some cases, 
C may desire an energy management plan by UC, and consequently he makes 
a clear statement to TTP that allows the latter to release the private data of C 
to UC. Therefore, it is challenging to formally prove that our trusted smart grid 
system leaks information only under C’s decision. 


Information Flow Control [10,26,29] is a key approach to ensuring that software
systems maintain the confidentiality and/or integrity of their data. Policies for 
secure information flow are usually formalized as non-interference [29] properties 
and systems that adhere to the stated policy are guaranteed to admit no flow of 
information that violates it. However, in many applications information is leaked 
by intention as in our smart grid example. To deal with such systems, informa- 
tion flow control approaches are usually extended with mechanisms that permit 
controlled information leakage. The major difficulty imposed by this extension 
is to formalize notions of security that are able to differentiate between the 
intentional and the unintentional information leakages in a system. 


Contribution. It is therefore natural to extend the enforcement of safety prop- 
erties of timed automata with the enforcement of appropriate Information Flow 
policies. It is immediate that the treatment of clocks, the non-determinism, and 
the unstructured control flow inherent in automata will pose a challenge. More 
fundamentally there is the challenge that timed automata is an automata-based 
formalism whereas most approaches to Information Flow take a language-based 
approach by developing type systems for programming languages with structured 
control flow or process calculi. 

We start by giving the semantics of timed automata (Sect. 2) based on the
ones used in UPPAAL [28]. Next, we formalize the security of a timed automaton
using a bisimulation relation (Sect. 3). This notion describes the observations of
a passive attacker and formally describes where an observation is allowed to leak
information and where it is not. To deal with implicit flows we define a general
notion of the post-dominator relation [18] (Sect. 4). We then develop a sound
algorithm (Sect. 5) that imposes information flow constraints on the clocks and
the variables of a timed automaton. We finish with our conclusions (Sect. 6) and
the proofs of our main results (Appendix).


Related Work. There are other papers dealing with Information Flow using 
language based techniques for programs with a notion of time [2,9,16,22] or 
programs that leak information intentionally [6,13,19-21,24]. Our contribution 
focuses on the challenges of continuous time and the guarded actions of timed 
automata. 

The works [7,8] define a notion of non-interference for timed automata
with high-level (secret) and low-level (public) actions. Their notion of security is 
expressed as a non-interference property and it depends on a natural number m, 
representing a minimum delay between high-level actions such that the low-level
behaviors are not affected by the high-level ones. The authors of [17] define a 
notion of timed non-interference based on bisimulations for probabilistic timed 
automata which again have high-level (secret) and low-level (public) actions. 
A somewhat different approach is taken in [12] that studies the synthesis of 
controllers. None of those approaches considers timed automata that have data 
variables, nor is their notion of security able to accommodate systems that leak 
information intentionally. 

The authors of [25] take a language-based approach and they define a type- 
system for programs written in the language Timed Commands. A program 
in their language gives rise to a timed automaton, and type-checked programs 
adhere to a non-interference like security property. However, their approach is 
limited only to automata that can be described by their language and they do 
not consider information release. 


2 Timed Automata 


A timed automaton [1,5] $TA = (Q, E, I, q_\circ)$ consists of a set of nodes $Q$, a set of
annotated edges $E$, and a labelling function $I$ on nodes. A node $q_\circ \in Q$ will be
the initial node and the mapping $I$ maps each node in $Q$ to a condition (to be
introduced below) that will be imposed as an invariant at the node.

The edges are annotated with actions and take the form $(q_s, g \to x := a : r, q_t)$
where $q_s \in Q$ is the source node and $q_t \in Q$ is the target node. The action
$g \to x := a : r$ consists of a guard $g$ that has to be satisfied in order for the multiple
assignments $x := a$ to be performed and the clock variables $r$ to be reset. We shall
assume that the sequences $x$ and $a$ of program variables and expressions,
respectively, have the same length and that $x$ does not contain any repetitions. To cater
for special cases we shall allow writing skip for the assignments of $g \to x := a : r$
when $x$ (and hence $a$) is empty; we shall also allow omitting the guard $g$ when it
equals tt and omitting the clock resets when $r$ is empty.

It has already emerged that we distinguish between (program) variables x 
and clock variables (or simply clocks) r. The arithmetic expressions a, guards g 
and conditions c are defined as follows using boolean tests b: 


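\[
\begin{aligned}
a &::= a_1\ op_a\ a_2 \mid x \mid n \\
b &::= \mathsf{tt} \mid \mathsf{ff} \mid a_1\ op_r\ a_2 \mid \neg b \mid b_1 \wedge b_2 \\
g &::= b \mid r\ op_c\ n \mid (r_1 - r_2)\ op_c\ n \mid g_1 \wedge g_2 \\
c &::= b \mid r\ op_d\ n \mid (r_1 - r_2)\ op_d\ n \mid c_1 \wedge c_2
\end{aligned}
\]


The arithmetic operators $op_a$ and the relational operators $op_r$ are as usual. For
comparisons of clocks we use the operators $op_c \in \{<, \leq, =, \geq, >\}$ in guards and
the less permissive set of operators $op_d \in \{<, \leq, =\}$ in conditions.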

To specify the semantics of timed automata let $\sigma$ be a state mapping
variables to values (which we take to be integers) and let $\delta$ be a clock assignment
mapping clocks to non-negative reals. We then have total semantic functions $[\![\cdot]\!]$
for evaluating the arithmetic expressions, boolean tests, guards and conditions;
the values of the arithmetic expressions and boolean expressions only depend
on the states whereas those of guards and conditions also depend on the clock
assignments.

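The configurations of the timed automata have the form $\langle q, \sigma, \delta \rangle \in \mathbf{Config}$
where $[\![I(q)]\!](\sigma, \delta)$ is true, and the transitions are described by an initial delay
(possibly none) that increases the values of all the clocks followed by an action.
Therefore, whenever $(q_s, g \to x := a : r, q_t)$ is in $E$ we have the rule:


\[
\langle q_s, \sigma, \delta \rangle \xrightarrow{\,d\,} \langle q_t, \sigma', \delta' \rangle
\quad \text{if} \quad
\left\{
\begin{array}{l}
d \geq 0, \\
{[\![I(q_s)]\!]}(\sigma, \delta + d) = \mathsf{tt}, \\
{[\![g]\!]}(\sigma, \delta + d) = \mathsf{tt}, \\
\sigma' = \sigma[x \mapsto [\![a]\!]\sigma], \quad \delta' = (\delta + d)[r \mapsto 0], \\
{[\![I(q_t)]\!]}(\sigma', \delta') = \mathsf{tt}
\end{array}
\right.
\]


where $d$ corresponds to the initial delay. The rule ensures that after the initial
delay the invariant and the guard are satisfied in the starting configuration and
updates the mappings $\sigma$ and $\delta$, where $\delta + d$ abbreviates $\lambda r.\ \delta(r) + d$. Finally,
it ensures that the invariant is satisfied in the resulting configuration. Initial
configurations assume that all clocks are initialized to 0 and have the form
$\langle q_\circ, \sigma, \lambda r.0 \rangle$.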
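
The transition rule can be read operationally. The following Python sketch checks the premises of the rule for one edge and one chosen delay and returns the successor configuration, or `None` when a premise fails. The `Edge` class, the `step` function, and the self-loop edge modelled loosely on the hourly measurement of the smart grid example are all assumptions introduced for illustration; this is not how UPPAAL or the paper's algorithm is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, int]        # sigma: program variables -> integers
Clocks = Dict[str, float]     # delta: clocks -> non-negative reals
Cond = Callable[[State, Clocks], bool]
Config = Tuple[str, State, Clocks]


@dataclass(frozen=True)
class Edge:
    source: str
    guard: Cond
    assigns: Tuple[Tuple[str, Callable[[State], int]], ...]
    resets: Tuple[str, ...]
    target: str


def step(invariant: Dict[str, Cond], edge: Edge,
         config: Config, d: float) -> Optional[Config]:
    """Delay by d, then take the edge, if every premise of the rule holds."""
    q, sigma, delta = config
    if q != edge.source or d < 0:
        return None
    delayed = {r: v + d for r, v in delta.items()}      # delta + d
    if not (invariant[q](sigma, delayed) and edge.guard(sigma, delayed)):
        return None
    sigma2 = dict(sigma)
    for x, a in edge.assigns:
        sigma2[x] = a(sigma)    # multiple assignment: evaluate in the old state
    delta2 = dict(delayed)
    for r in edge.resets:
        delta2[r] = 0.0
    if not invariant[edge.target](sigma2, delta2):
        return None
    return (edge.target, sigma2, delta2)


# Hypothetical one-node automaton: add a measurement once clock r reaches 1.
invariant = {"1": lambda s, c: c["T"] <= 720}
measure = Edge("1", lambda s, c: c["t"] <= 12 and c["r"] == 1,
               (("e1", lambda s: s["e1"] + s["ed"]),), ("r",), "1")

start = ("1", {"e1": 0, "ed": 5}, {"t": 0.0, "r": 0.0, "T": 0.0})
after_one_hour = step(invariant, measure, start, 1.0)
too_early = step(invariant, measure, start, 0.5)
```

A trace is then obtained by repeatedly choosing an edge and a delay for which `step` succeeds; summing the chosen delays gives the delay of a successful trace.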


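Traces. We define a trace from $\langle q_s, \sigma, \delta \rangle$ to $q_t$ in a timed automaton $TA$ to have
one of three forms. It may be a finite “successful” sequence

\[
\langle q_s, \sigma, \delta \rangle = \langle q'_0, \sigma'_0, \delta'_0 \rangle \xrightarrow{d_1} \cdots \xrightarrow{d_n} \langle q'_n, \sigma'_n, \delta'_n \rangle \quad (n > 0)
\]
\[
\text{such that } \{n\} = \{ i \mid q'_i = q_t \wedge 0 < i \leq n \}
\]

in which case at least one step is performed. It may be a finite “unsuccessful”
sequence

\[
\langle q_s, \sigma, \delta \rangle = \langle q'_0, \sigma'_0, \delta'_0 \rangle \xrightarrow{d_1} \cdots \xrightarrow{d_n} \langle q'_n, \sigma'_n, \delta'_n \rangle \quad (n \geq 0)
\]
\[
\text{such that } \langle q'_n, \sigma'_n, \delta'_n \rangle \text{ is stuck and } q_t \notin \{ q'_1, \ldots, q'_n \}
\]

where $\langle q'_n, \sigma'_n, \delta'_n \rangle$ is stuck when there is no action starting from $\langle q'_n, \sigma'_n, \delta'_n \rangle$.

Finally, it may be an infinite “unsuccessful” sequence

\[
\langle q_s, \sigma, \delta \rangle = \langle q'_0, \sigma'_0, \delta'_0 \rangle \xrightarrow{d_1} \cdots \xrightarrow{d_n} \langle q'_n, \sigma'_n, \delta'_n \rangle \xrightarrow{d_{n+1}} \cdots
\]
\[
\text{such that } q_t \notin \{ q'_1, \ldots, q'_n, \ldots \}
\]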


We shall write $[\![TA : q_s \mapsto q_t]\!](\sigma, \delta)$ for the set of traces from $\langle q_s, \sigma, \delta \rangle$ to $q_t$. We
then have the following proposition.


Proposition 1 [15]. For a pair $(\sigma, \delta)$, whenever $[\![TA : q_s \mapsto q_t]\!](\sigma, \delta)$ contains
only successful traces, then there exists a trace $t \in [\![TA : q_s \mapsto q_t]\!](\sigma, \delta)$ with
maximal length.


We also define the delay of a trace $t$ from $\langle q_s, \sigma, \delta \rangle$ to $q_t$: if $t$ is
a successful trace

\[
\langle q_s, \sigma, \delta \rangle = \langle q'_0, \sigma'_0, \delta'_0 \rangle \xrightarrow{d_1} \cdots \xrightarrow{d_n} \langle q'_n, \sigma'_n, \delta'_n \rangle = \langle q_t, \sigma', \delta' \rangle
\]

then

\[
\Delta(t) = \sum_{i=1}^{n} d_i
\]

In the case of $t$ being an unsuccessful (finite or infinite) trace we have that

\[
\Delta(t) = \infty
\]


Invariants:
  node 1: T ≤ 720
  node 2: T = 720
  node 3: T = 720
  node 4: T = 720

Abbreviations:
  data_am:         t ≤ 12 ∧ r = 1 → e1 := e1 + ed: r
  data_pm:         t > 12 ∧ t < 24 ∧ r = 1 → e2 := e2 + ed: r
  midnight:        t = 24 ∧ r = 1 → e2 := e2 + ed: r, t
  data,request:    T = 720 → s, c1, c2 := 1, e1, e2:
  data:            T = 720 → s, c1, c2 := 0, e1, e2:
  release_no:      s = 0 → skip:
  release_yes:     s = 1 → y1, y2 := c1, c2:
  price,analytics: p1, p2, a, f := v1, v2, z, 1
  price:           p1, p2, a, f := v1, v2, 0, 1
  bill,analytics:  f = 1 → b, x, e1, e2, f := p1 * c1 + p2 * c2, a, 0, 0, 0: T, t, r

Fig. 1. The timed automaton SG (and the abbreviations used).


Finally, for $(\sigma_1, \delta_1)$, $(\sigma_2, \delta_2)$, whenever for all $t_1 \in [\![TA : q_s \mapsto q_t]\!](\sigma_1, \delta_1)$ and
$t_2 \in [\![TA : q_s \mapsto q_t]\!](\sigma_2, \delta_2)$ we have that $\Delta(t_1) = \Delta(t_2)$, we will say that $(\sigma_1, \delta_1)$
and $(\sigma_2, \delta_2)$ have the same termination behaviour with respect to $q_s$ and $q_t$. Note
that it is not necessarily the case that a pair $(\sigma, \delta)$ has the same termination
behaviour as itself.


Example 1. To illustrate our development we shall consider an example automa- 
ton of a smart grid system as the one described in Sect. 1. The timed automaton 
SG is given in Fig. 1 and it uses the clocks t and T to model the time elapse of a 
day and a month respectively. Between midnight and noon, the electricity data 
ed is aggregated in the variable e1, while from noon to midnight the measure- 
ments are saved in the variable e2. The clock r is used to regulate the frequency 
of the measurements, by allowing one measurement every full hour. At the end 
of a day (midnight) the last measurement is calculated and the clock t is
reset to 0, indicating the start of a new day. At the end of each month (T = 720)
the trusted party TTP collects the data e1 and e2 of the meter and stores it in
the collectors c1 and c2 respectively. At the same time, the customer C sends a
service request s to TTP in case he desires to get some analytics regarding his
energy consumption. The TTP then requests from the UC the prices p1, p2 of
the electricity tariffs for the two time periods of interest and, in case C has
made a request for his data to be analysed (s = 1, otherwise s = 0), TTP also
reveals the collected data c1 and c2 to the UC, which stores them in
the variables y1 and y2 respectively. The UC then responds to the TTP by
sending the values v1 and v2 of the electricity tariffs and also the result z of C’s
data analytics in case C made a request for that; otherwise it sends the value 0.
Once the TTP receives everything (f = 1) it calculates the bill b for C and sends it
to him together with the analysis result a (C stores it in x); the clocks and the
variables of the meter are reset to 0 and a new month starts. For simplicity
we assume here that all the calculations done by the TTP and the UC at the
end of the month are completed in zero time.


3 Information Flow 


We envisage that there is a security lattice expressing the permissible flows [10]. 
Formally this is a complete lattice and the permitted flows go in the direction 
of the partial order. In our development, it will contain just two elements, L 
(for low) and H (for high), and we set $L \sqsubseteq H$ so that only the flow from H to
L is disallowed. For confidentiality, one would take L to mean public and H to 
mean private and for integrity one would take L to mean trusted and H to mean 
dubious. 

A security policy is then expressed by a mapping $\mathcal{L}$ that assigns an element
of the security lattice to each program variable and clock variable. An entity is
called high if it is mapped to H by $\mathcal{L}$, and it is said to be low if it is mapped to
L by $\mathcal{L}$. To express adherence to the security policy we use the binary operation
$\rightsquigarrow$ defined on sets $\chi$ and $\chi'$ (of variables and clocks):


\[
\chi \rightsquigarrow \chi' \;\equiv\; \forall u \in \chi : \forall u' \in \chi' : \mathcal{L}(u) \sqsubseteq \mathcal{L}(u')
\]


This expresses that all the entities of $\chi$ may flow into those of $\chi'$; note that
if one of the entities of $\chi$ has a high security level then it must be the case that
all the entities of $\chi'$ have high security level.


Example 2. Returning to Example 1 of our smart grid system, we have that $\mathcal{L}$
maps the program variable ed of the electricity data, the variables e1, e2 that
store this data, the collectors c1, c2 and the bill b to the security level H, while
the rest of the program variables and clocks are mapped to L.


Information flow control enforces a security policy by imposing constraints of the
form $\{y\} \rightsquigarrow \{x\}$ whenever the value of $y$ may somehow influence (or flow into)
that of $x$. Traditionally we distinguish between explicit and implicit flows as
explained below. As an example of an explicit flow consider a simple assignment
of the form $x := a$. This gives rise to a condition $fv(a) \rightsquigarrow \{x\}$ so as to indicate
that the explicit flow from the variables of $a$ to the variable $x$ must adhere to
the security policy: if $a$ contains a variable with high security level then $x$ also
must have high security level. For an example of an implicit flow consider a
conditional assignment $g \to x := 0$ where $x$ is assigned the constant value 0 in
case $g$ evaluates to true. This gives rise to a condition $fv(g) \rightsquigarrow \{x\}$ so as to
indicate that the implicit flow from the variables of $g$ to the variable $x$ must
adhere to the security policy: if $g$ contains a variable with high security level
then $x$ also must have high security level.
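
The flow constraints above are simple to check mechanically once a policy is fixed. The following Python sketch encodes the two-point lattice and the flow relation on sets of entities, and evaluates a few constraints for the policy of Example 2. The function name `flows`, the dictionary encoding, and the particular list of low entities are assumptions made for illustration.

```python
LOW, HIGH = "L", "H"

# The permitted directions in the two-point lattice: L below H, both reflexive.
ORDER = {(LOW, LOW), (LOW, HIGH), (HIGH, HIGH)}


def flows(policy, sources, targets):
    """Check that every source entity may flow into every target entity."""
    return all((policy[u], policy[v]) in ORDER
               for u in sources for v in targets)


# Policy of Example 2: ed, e1, e2, c1, c2 and b are high, everything else low.
high_entities = {"ed", "e1", "e2", "c1", "c2", "b"}
low_entities = {"s", "y1", "y2", "p1", "p2", "v1", "v2", "z", "a", "f", "x",
                "t", "T", "r"}
policy = {**{u: HIGH for u in high_entities},
          **{u: LOW for u in low_entities}}

# Explicit flow of e1 := e1 + ed: the free variables of the expression flow to e1.
explicit_ok = flows(policy, {"e1", "ed"}, {"e1"})

# Explicit flow of y1 := c1: high data would flow into a low variable.
release_ok = flows(policy, {"c1"}, {"y1"})

# Implicit flow of a guard over s into the variables assigned under it.
implicit_ok = flows(policy, {"s"}, {"y1", "y2"})
```

The second constraint fails under this policy, which is exactly the kind of intentional release that a plain non-interference check would reject.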

As has already been explained, many applications, such as our smart grid example,
inevitably leak some information. In this paper we develop an approach to ensure
that the security policy is adhered to by the timed automaton of interest, while
allowing it to be bypassed under certain conditions. Thus, for a timed automaton
$TA = (Q, E, I, q_\circ)$, we shall assume that there exists a set of observable nodes $Y \subseteq Q$,
that are the nodes where the values of program variables and clocks with low
security are observable by an attacker. The observable nodes will be described
by the union of two disjoint sets $Y_s$ and $Y_w$, where a node $q$ in $Y_s$ ($Y_w$ resp.) will
be called strongly observable (weakly observable resp.). The key idea is to ensure
that $\{x\} \rightsquigarrow \{y\}$ whenever there is an explicit flow of information from $x$ to $y$ (as
illustrated above) or an implicit flow from $x$ to $y$ in computations that lead to
strongly observable nodes, while computations that lead to weakly observable
nodes are allowed to bypass the security policy $\mathcal{L}$.

To overcome the vagueness of this explanation we need to define a semantic 
condition that encompasses our notion of permissible information flow, where 
information leakage occurs only at specific places in our automaton. 


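Observable Steps. Since the values of low program variables and clocks are only
observable at the nodes in $Y$, we collapse the transitions of the automaton that
lead to non-observable nodes into one. Thus we have an observable successful
step

\[
\langle q_s, \sigma, \delta \rangle \stackrel{D}{\Longrightarrow}_Y \langle q_t, \sigma', \delta' \rangle
\]

whenever there exists a successful trace $t$

\[
\langle q_s, \sigma, \delta \rangle = \langle q'_0, \sigma'_0, \delta'_0 \rangle \xrightarrow{d_1} \cdots \xrightarrow{d_n} \langle q'_n, \sigma'_n, \delta'_n \rangle = \langle q_t, \sigma', \delta' \rangle \quad (n > 0)
\]

from $\langle q_s, \sigma, \delta \rangle$ to $q_t$ in $TA$ and $q_t \in Y$, $D = \Delta(t)$ and $\forall i \in \{1, \ldots, n-1\} : q'_i \notin Y$.
And we have an observable unsuccessful trace

\[
\langle q_s, \sigma, \delta \rangle \stackrel{\infty}{\Longrightarrow}_Y \bot
\]

whenever there exists an unsuccessful finite trace

\[
\langle q_s, \sigma, \delta \rangle = \langle q'_0, \sigma'_0, \delta'_0 \rangle \xrightarrow{d_1} \cdots \xrightarrow{d_n} \langle q'_n, \sigma'_n, \delta'_n \rangle \quad (n \geq 0)
\]

or an unsuccessful infinite trace

\[
\langle q_s, \sigma, \delta \rangle = \langle q'_0, \sigma'_0, \delta'_0 \rangle \xrightarrow{d_1} \cdots \xrightarrow{d_n} \langle q'_n, \sigma'_n, \delta'_n \rangle \xrightarrow{d_{n+1}} \cdots
\]

from $\langle q_s, \sigma, \delta \rangle$ to any of the nodes in $Y$ and $\forall i > 0 : q'_i \notin Y$. From now on it
should be clear that a configuration $\gamma$ will range over $\mathbf{Config} \cup \{\bot\}$.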
We write $(\sigma, \delta) \equiv (\sigma', \delta')$ to indicate that the two pairs are equal on low
variables and low clocks:


\[
(\sigma, \delta) \equiv (\sigma', \delta') \quad \text{iff} \quad
\forall x : \mathcal{L}(x) = L \Rightarrow \sigma(x) = \sigma'(x) \;\wedge\;
\forall r : \mathcal{L}(r) = L \Rightarrow \delta(r) = \delta'(r)
\]


It is immediate that this definition of $\equiv$ gives rise to an equivalence relation.
Intuitively $\equiv$ represents the view of a passive attacker as defined in [24], a
principal that is able to observe the computations of a timed automaton and deduce
information.
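
As a small illustration of low equivalence, the following Python sketch compares two pairs of a state and a clock assignment on their low entities only. The function name and the toy policy over a low variable l, a high variable h, and a low clock t are assumptions made for this example.

```python
def low_equivalent(policy, pair1, pair2):
    """Return True if the two (state, clocks) pairs agree on all low entities."""
    (sigma1, delta1), (sigma2, delta2) = pair1, pair2
    low_vars = [x for x in sigma1 if policy[x] == "L"]
    low_clocks = [r for r in delta1 if policy[r] == "L"]
    return (all(sigma1[x] == sigma2[x] for x in low_vars)
            and all(delta1[r] == delta2[r] for r in low_clocks))


# Hypothetical policy: l and the clock t are low, h is high.
policy = {"l": "L", "h": "H", "t": "L"}
clocks = {"t": 0.0}

# The pairs differ only in the high variable, so the attacker cannot tell them apart.
before = low_equivalent(policy, ({"l": 0, "h": 1}, clocks),
                        ({"l": 0, "h": 0}, clocks))

# After an assignment l := h in both, the difference becomes visible in l.
after = low_equivalent(policy, ({"l": 1, "h": 1}, clocks),
                       ({"l": 0, "h": 0}, clocks))
```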
We will now define our security notion with the use of a bisimulation relation.
Our notion shares some ideas from [19,21], where a bisimulation-based security 
is defined for a programming language with threads. In their approach, the 
bypassing of the security policy is localized on the actions, and that is because 
their attacker model is able to observe the low variables of a program at any of 
its computation steps (e.g. in a timed-automaton all of the nodes would have 
been observable). In contrast to [19,21], we localize bypassing of policies at the 
level of the nodes, while we also define a more flexible notion of security with 
respect to the attacker’s observability. 


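Definition 1 ($Y$-Bisimulation). For a timed automaton $TA = (Q, E, I, q_\circ)$ and
a set of nodes $Y = Y_s \cup Y_w$, a relation $\simeq_Y \subseteq (\mathbf{Config} \cup \{\bot\}) \times (\mathbf{Config} \cup \{\bot\})$
will be called a $Y$-bisimulation relation if $\simeq_Y$ is symmetric and we have that if
$\gamma_1 = \langle q_1, \sigma_1, \delta_1 \rangle \simeq_Y \langle q_2, \sigma_2, \delta_2 \rangle = \gamma_2$ then


\[
\begin{array}{l}
(\sigma_1, \delta_1) \equiv (\sigma_2, \delta_2) \Rightarrow \text{if } \gamma_1 \stackrel{D_1}{\Longrightarrow}_Y \gamma'_1 \text{ then } \exists \gamma'_2, D_2 : \\
\qquad \gamma_2 \stackrel{D_2}{\Longrightarrow}_Y \gamma'_2 \;\wedge\; \gamma'_1 \simeq_Y \gamma'_2 \;\wedge \\
\qquad (\gamma'_1 \neq \bot \wedge \gamma'_2 \neq \bot) \Rightarrow ((node(\gamma'_1) \in Y_w \wedge node(\gamma'_2) \in Y_w) \;\vee \\
\qquad\qquad pair(\gamma'_1) \equiv pair(\gamma'_2))
\end{array}
\]


where $node(\langle q, \sigma, \delta \rangle) = q$, $pair(\langle q, \sigma, \delta \rangle) = (\sigma, \delta)$, and if $\gamma_1 \simeq_Y \gamma_2$ then
$(\gamma_1 = \bot \Leftrightarrow \gamma_2 = \bot)$.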


We write $\sim_Y$ for the union of all the $Y$-bisimulations and it is immediate that
$\sim_Y$ is both a $Y$-bisimulation and an equivalence relation. Intuitively,
when two configurations are related in $\sim_Y$ and they are low equivalent,
then they produce distinguishable pairs of states only at the weakly observable
nodes. Otherwise, observations made at strongly observable nodes should be still
indistinguishable. In both cases, the resulting configurations of two $Y$-bisimilar
configurations should also be $Y$-bisimilar. We are now ready to define our
security notion.


Definition 2 (Security of Timed Automata). For a timed automaton $TA =
(Q, E, I, q_\circ)$ and a set $Y = Y_s \cup Y_w$ of observable nodes, we will say that $TA$
satisfies the information security policy $\mathcal{L}$ whenever:


\[
\forall q \in \{q_\circ\} \cup Y : \forall (\sigma, \delta), (\sigma', \delta') :
([\![I(q)]\!](\sigma, \delta) \wedge [\![I(q)]\!](\sigma', \delta')) \Rightarrow \langle q, \sigma, \delta \rangle \sim_Y \langle q, \sigma', \delta' \rangle
\]


Whenever $Y_w = \emptyset$ our notion of security coincides with standard definitions of
non-interference [29], where an automaton that satisfies the information security
policy $\mathcal{L}$ does not leak any information about its high variables.


Example 3. For the smart grid automaton SG of Example 1, we have the set
of observable nodes $Y = \{2, 3, 4\}$, where the strongly observable ones are the
nodes 2 and 4 ($Y_s = \{2, 4\}$), and the weakly observable one is the node 3 ($Y_w = \{3\}$), where
the TTP is allowed to release the secret information of C.


4 Post-dominators


For the implicit flows arising from conditions, we are interested in finding their 
end points (nodes) that are the points where the control flow is not dependent 
on the conditions anymore. For that, we define a generalized version of the post- 
dominator relation and the immediate post-dominator relation [18]. 


Paths. A path $\pi$ in a timed automaton $TA = (Q, E, I, q_\circ)$ is a finite $\pi =
q_0 act_1 q_1 \ldots q_{n-1} act_n q_n$ ($n \geq 0$) or infinite $\pi = q_0 act_1 q_1 \ldots q_{n-1} act_n q_n \ldots$ sequence
of nodes and actions such that $\forall i > 0 : (q_{i-1}, act_i, q_i) \in E$. We say that a path is
trivial if $\pi = q_0$ and we say that a node $q$ belongs to the path $\pi$, or $\pi$ contains $q$,
and we will write $q \in \pi$, if there exists some $i$ such that $q_i = q$. For a finite path
$\pi = q_0 act_1 q_1 \ldots q_{n-1} act_n q_n$ we write $\pi(i) = q_i act_{i+1} q_{i+1} \ldots q_{n-1} act_n q_n$ ($i \leq n$) for
the suffix of $\pi$ that starts at the $i$-th position and we usually refer to it as the
$i$-th suffix of $\pi$. Finally, for a node $q$ and a set of nodes $Y \subseteq Q$ we write


\[
\Pi_{(q,Y)} = \{ \pi \mid \pi = q_0 act_1 q_1 \ldots q_{n-1} act_n q_n : n > 0 \wedge q_0 = q \wedge q_n \in Y \wedge
\forall i \in \{1, \ldots, n-1\} : q_i \notin Y \}
\]

for the set of all the non-trivial finite paths that start at $q$, end at a node $y$ in
$Y$ and all the intermediate nodes of the path do not belong in $Y$.

Definition 3 (Post-dominators). For a node $q$ and a set of nodes $Y \subseteq Q$ we
define the set

\[
pdom_Y(q) = \{ q' \mid \forall \pi \in \Pi_{(q,Y)} : q' \in \pi(1) \}
\]

and whenever $q' \in pdom_Y(q)$, we will say that $q'$ is a $Y$ post-dominator of $q$.

Intuitively, whenever a node $q'$ is a $Y$ post-dominator of a node $q$ it means that
every non-trivial path that starts at $q$ has to visit $q'$ before it visits one of the
nodes in $Y$. We write $pdom_y(q)$ whenever $Y = \{y\}$ is a singleton and we have
the following facts.


Fact 1. For a set of nodes $Y \subseteq Q$ and for a node $q$ we have that

\[
pdom_Y(q) = \bigcap_{y \in Y} pdom_y(q)
\]


Fact 2. The post-dominator set for a singleton set $\{y\}$ can be computed by
finding the greatest solution of the following data-flow equations:


\[
\begin{array}{ll}
pdom_y(q) = Q & \text{if } \Pi_{(q,\{y\})} = \emptyset \\
pdom_y(q) = \{y\} & \text{if } y \in succ(q) \\
pdom_y(q) = \bigcap_{q' \in succ(q)} (\{q'\} \cup pdom_y(q')) & \text{otherwise}
\end{array}
\]
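
The data-flow equations of Fact 2 can be solved by starting every node at the top element $Q$ and iterating until nothing changes. The following Python sketch does this for a successor map and then combines the singleton results as in Fact 1. The function names, the successor-map encoding, and the small diamond-shaped graph are assumptions introduced for illustration; edge actions are ignored because only the graph structure matters here.

```python
def reaches(succ, q, y):
    """True if there is a non-trivial path from q to y."""
    seen, stack = set(), list(succ.get(q, ()))
    while stack:
        n = stack.pop()
        if n == y:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(succ.get(n, ()))
    return False


def pdom_single(nodes, succ, y):
    """Greatest solution of the data-flow equations for the singleton {y}."""
    nodes = set(nodes)
    pdom = {q: set(nodes) for q in nodes}     # start from the top element Q
    changed = True
    while changed:
        changed = False
        for q in nodes:
            if not reaches(succ, q, y):
                new = set(nodes)
            elif y in succ.get(q, ()):
                new = {y}
            else:
                new = set(nodes)
                for q2 in succ[q]:
                    new &= {q2} | pdom[q2]
            if new != pdom[q]:
                pdom[q], changed = new, True
    return pdom


def pdom_set(nodes, succ, ys):
    """Combine the singleton results by intersection, as stated in Fact 1."""
    result = {q: set(nodes) for q in nodes}
    for y in ys:
        single = pdom_single(nodes, succ, y)
        for q in nodes:
            result[q] &= single[q]
    return result


# Hypothetical diamond: a branches to b and c, both rejoin at d, then y.
nodes = {"a", "b", "c", "d", "y"}
succ = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": {"y"}}

pdom_y = pdom_single(nodes, succ, "y")
pdom_dy = pdom_set(nodes, succ, {"d", "y"})
```

In the diamond, every path from a to y passes through d, so d and y post-dominate a, while the branch nodes b and c do not.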


For a node q, we are interested in finding the Y post-dominator “closest” to it. 


Definition 4. For a node $q$ and a set of nodes $Y$ we define the set


\[
ipdom_Y(q) = \{ q' \in pdom_Y(q) \mid pdom_Y(q) = \{q'\} \;\vee\;
(q' \notin Y \wedge (\forall q'' \in pdom_Y(q) : q'' \neq q' \Rightarrow q'' \in pdom_Y(q'))) \}
\]


and a node $q' \in ipdom_Y(q)$ will be called an immediate $Y$ post-dominator of $q$.

The following fact gives us a unique immediate $Y$ post-dominator for the nodes
that can reach $Y$ ($\Pi_{(q,Y)} \neq \emptyset$). Intuitively this unique immediate $Y$
post-dominator of a node $q$ is the node that is the “closest” $Y$ post-dominator of
$q$, meaning that in any non-trivial path starting from $q$ and ending in $Y$, the $Y$
immediate post-dominator of $q$ will always be visited first before any other $Y$
post-dominator of $q$.


Fact 3. For a set of nodes $Y$ and a node $q$, whenever $\Pi_{(q,Y)} \neq \emptyset$ and
$pdom_Y(q) \neq \emptyset$ then there exists a node $q'$ such that $ipdom_Y(q) = \{q'\}$.
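
To see how the immediate post-dominator is singled out, the following Python sketch evaluates the condition of Definition 4 on post-dominator sets that are assumed to have been computed already. The function name `ipdom`, the dictionary encoding, and the hand-written post-dominator sets for a small diamond-shaped graph (a branching to b and c, rejoining at d, then reaching the single observable node y) are hypothetical.

```python
def ipdom(pdom, observable, q):
    """Immediate Y post-dominators of q, given precomputed pdom sets."""
    candidates = pdom[q]
    result = set()
    for q1 in candidates:
        # Either q1 is the only post-dominator of q ...
        only_one = candidates == {q1}
        # ... or q1 is not observable and every other post-dominator of q
        # also post-dominates q1, so q1 is the one visited first.
        closest = q1 not in observable and all(
            q2 in pdom[q1] for q2 in candidates if q2 != q1)
        if only_one or closest:
            result.add(q1)
    return result


# Hand-written post-dominator sets for the hypothetical diamond, with Y = {y}.
all_nodes = {"a", "b", "c", "d", "y"}
pdom = {
    "a": {"d", "y"},
    "b": {"d", "y"},
    "c": {"d", "y"},
    "d": {"y"},
    "y": set(all_nodes),   # y cannot reach Y again, so its set is all of Q
}
observable = {"y"}

ipdom_a = ipdom(pdom, observable, "a")
ipdom_d = ipdom(pdom, observable, "d")
```

For a, the join node d is the closest post-dominator; for d, the only post-dominator is y itself, so the singleton case applies.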


For simplicity, whenever a node $q'$ is the unique immediate $Y$ post-dominator of
a node $q$ and $\Pi_{(q,Y)} \neq \emptyset$ we shall write $ipd_Y(q)$ for $q'$ and we will say that the
unique immediate $Y$ post-dominator of $q$ is defined. For any other case where
$q$ can either not reach $Y$ ($\Pi_{(q,Y)} = \emptyset$) or $pdom_Y(q) = \emptyset$ we will say that the
unique immediate post-dominator of $q$ is not defined.


Example 4. For the timed automaton SG and for the set of observable nodes
$Y = \{2, 3, 4\}$, we have that $pdom_Y(q) = ipdom_Y(q) = \{2\}$ for $q$ being 1, 3 and 4
while $pdom_Y(2) = ipdom_Y(2) = \emptyset$. Therefore for the nodes 1, 3 and 4 their unique
immediate $Y$ post-dominator is defined and it is the node 2, while the unique
immediate $Y$ post-dominator of the node 2 is not defined.


5 Algorithm for Secure Information Flow 


We develop an algorithm (Fig. 2) that traverses the graph of a timed automaton
$TA = (Q, E, I, q_\circ)$ and imposes information flow constraints on the program
variables and clocks of the automaton with respect to a security policy $\mathcal{L}$ and
a $Y$ post-dominator relation, where $Y = Y_s \cup Y_w$ is the set of observable nodes.
Before we explain the algorithm we start by defining some auxiliary operators.


Auxiliary Operators. For an edge $(q_s, g \to x := a : r, q_t) \in E$ we define the
auxiliary operators ass(.), expr(.) and con(.) as


\[
\begin{aligned}
ass((q_s, g \to x := a : r, q_t)) &= \{x, r\} \\
expr((q_s, g \to x := a : r, q_t)) &= \{a\} \\
con((q_s, g \to x := a : r, q_t)) &= I(q_s) \wedge g \wedge I(q_t)[a/x][0/r]
\end{aligned}
\]


where ass(.) gives the modified variables and clocks of the assignment performed
by $TA$ using that edge, expr(.) gives the expressions used for the assignment,
and the operator con(.) returns the condition that has to hold in order for the
assignment to be performed. We finally lift the ass(.) operator to finite paths and
thus for a finite path $\pi = q_0 act_1 q_1 \ldots q_{n-1} act_n q_n$ we define the auxiliary operator
Ass(.) as


\[
Ass(q_0 act_1 q_1 \ldots q_{n-1} act_n q_n) = \bigcup_{i=1}^{n} ass((q_{i-1}, act_i, q_i))
\]


We write

\[
Q_{\rightsquigarrow w} = \{ q \mid \forall \pi = q \ldots q' \in \Pi_{(q,Y)} : q' \in Y_w \}
\]

C1. For all $q \in Q_{\rightsquigarrow w}$:

a) Nece, Vy € fv(con(e)) : L(y) = LA (We V Ae) 

C2. For all q € Q&.,, such that their unique immediate Y post-dominator is defined : 
a) Nect, retetei fv(con(e)) ~ Ass(T) A Ae 

b) Neżet:ecEg, e'€Eq, sat(con(e)Acon(e’)) fv(con(e)) eas U 
c) P > Nece, Vy € fv(con(e)) : L(y) = L 


Ass(1) 


TEM (ce! fipdy (a)}) 


C3. For all q E€ Q&.,, such that their unique immediate Y post-dominator is not defined : 
a) Neee, Yy € fv(con(e)) : L(y) = LA 
b) ((e ~ wA Pe) V Ae) 


Fig. 2. Security of $TA = (Q, E, I, q_\circ)$ with respect to $\mathcal{L}$ and the $Y$ post-dominator
relation


for the set of nodes, where whenever the automaton performs a successful
observable step starting from a node $q \in Q_{\rightsquigarrow w}$ and ending in an observable node $q' \in Y$,
then it is always the case that $q'$ is weakly observable.


Condition C1. We start by looking at the nodes in $Q_{\rightsquigarrow w}$. According to our
security notion (Definition 2), for two low equivalent configurations at a node
$q$, whenever the first one performs a successful (or unsuccessful) observable step
that ends at a weakly observable node, then also the second should be able
to perform an observable step that ends at a weakly observable node (or an
unsuccessful one resp.). For that, the condition C1 (a) first requires that the
conditions of the outgoing edges in $E_q$, where $E_q = \{ (q, act, q') \mid (q, act, q') \in E \}$,
contain only low variables. However, this is not enough.

To explain the rest of the constraints imposed by the condition C1 (a) consider
the automaton (a) of Fig. 3, where the node 3 is weakly observable, h and l is a
high and a low variable respectively, and all the invariants of the nodes are set
to tt. This automaton is not secure with respect to Definition 2. To see this,
we have $([l \mapsto 0, h \mapsto 1], \delta) \equiv ([l \mapsto 0, h \mapsto 0], \delta)$ (for some clock state $\delta$)
but the pair $([l \mapsto 0, h \mapsto 1], \delta)$ always produces $\bot$ since we will have an infinite
loop at the node 2, while $([l \mapsto 0, h \mapsto 0], \delta)$ always terminates at the node 3.
That is because even if both edges of the node 2 contain only the low variable l
in their condition, the assignment l:=h bypasses the policy $\mathcal{L}$ and thus, right
after it, the two pairs stop being low equivalent.

Fig. 3. Example automata (a) (top) and (b) (bottom)

As another example, consider the automaton (b) of Fig. 3. Here the node 4 is
weakly observable, h is a high variable, l, l' are two low variables and all the
invariants of nodes are set to tt again. We have
$([l \mapsto 0, l' \mapsto 0, h \mapsto 1], \delta) \equiv ([l \mapsto 0, l' \mapsto 0, h \mapsto 0], \delta)$ (for some clock state $\delta$)
and again the first pair produces $\bot$ by looping at the node 3, whereas the second
pair always terminates. Here even if the variable l is not used in any condition
after the assignment l:=h, it influences the value of l' and consequently, since l'
appears in the condition of the edges of the node 3, we get this behavior.

To cater for such cases, for an edge $e = (q_s, g \to x := a : r, q_t)$ we first define
the predicate

$A_e = \bigwedge_i fv(a_i) \leadsto \{x_i\}$

that takes care of the explicit flows arising from the assignments. We then define

$\Pi_{(e,Y)} = \{\pi \mid e = (q_0, act_1, q_1) : \pi = q_0 act_1 q_1 \dots q_{n-1} act_n q_n \in \Pi_{(q_0,Y)}\}$

to be the set of paths (the ones defined in Sect. 4) that start with e and end in Y, and
all the intermediate nodes do not belong to Y. Finally, whenever an assignment
bypasses the security policy $\mathcal{L}$ due to an explicit flow and thus $A_e$ is false, we
then impose the predicate

$\Psi_e = \forall \pi \in \Pi_{(e,Y)} : \forall q \in \pi(1) :$
$\quad q \notin Y \Rightarrow (\forall e' \in E_q : (ass(e) \setminus R) \cap (fv(con(e')) \cup fv(expr(e'))) = \emptyset)$

The predicate $\Psi_e$ demands that the assigned program variables of
$e = (q_s, act, q_t)$ cannot be used in any expression or condition that appears in a
path that starts with $q_t$ and goes to an observable node. Note here that even if
$\Psi_e$ quantifies over a possibly infinite set of paths ($\Pi_{(e,Y)}$), it can be computed in
finite time by only looking at the paths where each cycle occurs at most once.
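The two predicates can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Edge` record, the integer encoding of security levels, and the sample automaton (loosely modelled on the description of automaton (a) of Fig. 3) are hypothetical, and clock resets are omitted because $\Psi_e$ removes the clocks R from ass(e) anyway. Instead of enumerating paths, the sketch collects the non-observable nodes that lie on some path from the target of e to Y, which yields the same set of nodes to check.

```python
from collections import namedtuple

# Hypothetical edge record: `assigns` maps each assigned variable x_i to the
# free variables of its right-hand side a_i; `fv_cond` is fv(con(e)).
Edge = namedtuple("Edge", "src dst assigns fv_cond")

LOW, HIGH = 0, 1


def explicit_flows_ok(edge, level):
    """A_e: every variable read by a_i is allowed to flow into x_i."""
    return all(level[y] <= level[x]
               for x, rhs in edge.assigns.items() for y in rhs)


def psi(edge, edges, Y):
    """Psi_e: variables assigned by `edge` are unused on the way to Y."""
    out = {}
    for e in edges:
        out.setdefault(e.src, []).append(e)
    # Nodes outside Y reachable from the target of `edge` via nodes outside Y.
    forward, stack = set(), [edge.dst]
    while stack:
        n = stack.pop()
        if n in Y or n in forward:
            continue
        forward.add(n)
        stack.extend(e.dst for e in out.get(n, []))
    # Keep only the nodes that can still reach Y (they lie on a path to Y).
    on_path, changed = set(), True
    while changed:
        changed = False
        for n in forward - on_path:
            if any(e.dst in Y or e.dst in on_path for e in out.get(n, [])):
                on_path.add(n)
                changed = True
    assigned = set(edge.assigns)
    for n in on_path:
        for e2 in out.get(n, []):
            used = set(e2.fv_cond).union(*e2.assigns.values())
            if assigned & used:
                return False
    return True


# Made-up automaton: 1 --(l := h)--> 2, a loop on 2 and an exit to 3,
# both guarded by a condition over l; node 3 is the only observable node.
e1 = Edge(1, 2, {"l": {"h"}}, set())
e2 = Edge(2, 2, {}, {"l"})
e3 = Edge(2, 3, {}, {"l"})
edges, Y = [e1, e2, e3], {3}
level = {"l": LOW, "h": HIGH}
print(explicit_flows_ok(e1, level), psi(e1, edges, Y))  # False False
```

With both predicates false for the first edge, the disjunction $\Psi_e \vee A_e$ of condition C1 (a) fails in this made-up instance, which matches the informal explanation of why the assignment l:=h is problematic.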

We will now look at the nodes where the automaton may perform a successful
observable step that ends in a strongly observable node. Those nodes are
described by the set $Q^c_{\leadsto w} = Q \setminus Q_{\leadsto w}$, that is the complement of $Q_{\leadsto w}$.


Condition C2. For a node q in $Q^c_{\leadsto w}$, whose immediate Y post-dominator is
defined, condition C2 (a) takes care of the explicit and the implicit flows generated
by the assignment and the control dependencies respectively, arising from
the edges of q. Note here that we do not propagate the implicit flows any further
after $ipd_Y(q)$. This is because $ipd_Y(q)$ is the point where all the branches of q are
joining and any further computation is not control-dependent on them anymore.
Those constraints are along the line of Denning's approach [10] of the so-called
block-labels.

To understand condition C2 (b) consider the automaton (c) of Fig. 4, where h
and l is a high and a low variable respectively, the node 2 is strongly observable,
and both nodes 1 and 2 have their invariant set to tt. Next take
$([l \mapsto 0, h \mapsto 1], \delta) \equiv ([l \mapsto 0, h \mapsto 0], \delta)$ (for some clock state $\delta$) and
note that the first pair can result in a configuration in 2 with
$([l \mapsto 0, h \mapsto 1], \delta)$ (taking the top branch) while the second
pair always ends in 2 with $[l \mapsto 1, h \mapsto 0]$. Therefore this automaton is not secure
with respect to our Definition 2.

Fig. 4. Example automaton (c)

To take care of such behaviours we write sat(···) to express the satisfiability
of the ··· formula. Whenever there are two branches (induced by the edges e and
e' both leaving q) that are not mutually exclusive (that is, where
sat(con(e) ∧ con(e')) holds) we make sure to record the information flow arising
from bypassing the branch that would otherwise perform an assignment. This is
essential for dealing with non-determinism.


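Fact 4. For a timed automaton $TA = (Q, E, I, q_\circ)$, we have that if
$(q, \sigma, \delta) \stackrel{D}{\Longrightarrow}_{\{q'\}} (q', \sigma', \delta')$
then
$\{x \mid \sigma(x) \neq \sigma'(x)\} \cup \{r \mid \delta'(r) \neq \delta(r) + D\} \subseteq \bigcup_{\pi \in \Pi_{(e,\{q'\})}} Ass(\pi)$
where e corresponds to the initial edge of this observable step.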


Condition C2 (c) takes care of cases where a timing/termination side channel 
[2] could have occurred. 

As an example of such a case consider the automaton (d) of Fig. 5, where h and
t is a high program variable and a low clock respectively, node 2 is strongly
observable and both 1 and 2 have their invariant set to tt. Next, for
$([h \mapsto 1], [t \mapsto 0]) \equiv ([h \mapsto 0], [t \mapsto 0])$ we have that the first pair always
delays at least 30 units and ends in 2 with a clock state that
has t > 30, whereas the second pair can go to 2, taking the
lower branch immediately without any delay, and thus the resulting pairs will
not be low equivalent. To take care of such behaviours, we stipulate a predicate
$\Phi_q$ such that

$\left(\exists t_1, t_2 \in \bigcup_{(\sigma,\delta) : [\![I(q)]\!](\sigma,\delta)} [\![TA : q \mapsto ipd_Y(q)]\!](\sigma, \delta) : \Delta(t_1) \neq \Delta(t_2)\right) \Rightarrow \Phi_q$

Fig. 5. Example automaton (d)

Using this predicate we demand that whenever the TA does not have a "constant"
termination behavior from the node q to the node $ipd_Y(q)$, then variables that
influence the termination behavior should not be of high security level.


Condition C3. We are now left with the nodes in $Q^c_{\leadsto w}$, whose immediate Y
post-dominator is not defined. Since for such a node q, we cannot find a point
(the unique immediate Y post-dominator) where the control dependencies from
the branches of q end, condition C3 (a) requires that the conditions of the edges
of q should not be dependent on high security variables.

Condition C3 (b) caters for the explicit flows of an edge e using the predicate
$A_e$. However we are allowed to dispense with $A_e$ whenever further computations after
taking the edge e may lead only to weakly observable nodes and $\Psi_e$ holds. To
express this for an edge $e = (q_s, g \to x := a : r, q_t)$ we write

$e \leadsto w$

whenever $q_t \in Y_w$ or $q_t \in Q_{\leadsto w}$.

Example 5. Consider now the automaton SG of Example 1, and the Y post-dominator
relation of Example 4.

We have that the nodes 1, 3 and 4 are in $Q^c_{\leadsto w}$ and also that their immediate
unique Y post-dominator is defined. Condition C2 (a) and C2 (b) impose the
following constraints


{T,t, r} r {ed, ei, ci, S, t, r}, {ed, ei} k {ei}, fei} bi {ci}, {vi} ee {pi} (i = 1,2) 
{T} ~> {Pi, P2,a, f}, {z} ~ {a}, {} ~ {s,f, at 


Finally, for the node 1, because $\Phi_1$ holds (C2 (c)) all the clocks need to be of low
security level.

Next, the node 2 is in $Q^c_{\leadsto w}$, and since its unique immediate Y post-dominator
is not defined, condition C3 (b) imposes the constraints


{P1; P2; c1,€2} ~> {b}, {a} ~> {x}; {} ~> fer, e2, f} 


and condition C3 (a) imposes that T, s and f should be of low security level.
Notice here that since for the edge $e = (2, s = 1 \to y_1, y_2 := c_1, c_2 : , 3)$ that releases
the sensitive information of C we have that $e \leadsto w$, we are not imposing the
constraint $\{c_i\} \leadsto \{y_i\}$ (i = 1, 2). Those constraints are easy to verify for the
security assignment of Example 2.

Now if we were to change the node 3 from being a weakly observable to a
strongly observable node, the automaton SG would not be secure with respect to
Definition 2. In that case our algorithm would reject it, since for the edge e we
would have that $e \not\leadsto w$ and the predicate $A_e$ would have resulted in false.


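Finally, we shall write $sec_{Y,\mathcal{L}}(TA)$ whenever the constraints arising from our
algorithm (Fig. 2) are satisfied and thus we have the following lemmas


Lemma 1. For a timed automaton $TA = (Q, E, I, q_\circ)$, if $sec_{Y,\mathcal{L}}(TA)$ then for
$(\sigma_1, \delta_1), (\sigma_2, \delta_2)$ such that $[\![I(q)]\!](\sigma_1, \delta_1)$ and $[\![I(q)]\!](\sigma_2, \delta_2)$ and $(\sigma_1, \delta_1) \equiv (\sigma_2, \delta_2)$
we have that

if $(q, \sigma_1, \delta_1) \stackrel{D_1}{\Longrightarrow}_Y (q', \sigma_1', \delta_1')$ then $\exists (\sigma_2', \delta_2'), D_2 : (q, \sigma_2, \delta_2) \stackrel{D_2}{\Longrightarrow}_Y (q', \sigma_2', \delta_2') \;\wedge$
$(q' \in Y_w \vee (\sigma_1', \delta_1') \equiv (\sigma_2', \delta_2'))$


Lemma 2. For a timed automaton $TA = (Q, E, I, q_\circ)$, if $sec_{Y,\mathcal{L}}(TA)$ then for
$(\sigma_1, \delta_1), (\sigma_2, \delta_2)$ such that $[\![I(q)]\!](\sigma_1, \delta_1)$ and $[\![I(q)]\!](\sigma_2, \delta_2)$ and $(\sigma_1, \delta_1) \equiv (\sigma_2, \delta_2)$
we have that

if $(q, \sigma_1, \delta_1) \Longrightarrow_Y \bot$ then also $(q, \sigma_2, \delta_2) \Longrightarrow_Y \bot$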


The following theorem combines the two lemmas from above to establish
the soundness of our algorithm with respect to the notion of security of
Definition 2.


Theorem 1. For a timed automaton $TA = (Q, E, I, q_\circ)$, if $sec_{Y,\mathcal{L}}(TA)$ then TA
satisfies the information security policy $\mathcal{L}$.


6 Conclusion


We have shown how to successfully enforce Information Flow Control policies
on timed automata. This has facilitated developing an algorithm that prevents
unnecessary label creep and that deals with non-determinism, non-termination,
and continuous real-time. The algorithm has been proved sound by means of a
bisimulation result that allows controlled information leakage.

We are exploring how to automate the analysis and in particular how to
implement (a sound approximation of) the $\Phi_q$ predicate. There has been a lot of
research [3,4] done for determining the maximum (max) or minimum (min)
execution time that an automaton needs to move from a location $q_s$ to a location
$q_t$. One possibility is to make use of this work [3,4] and thus the predicate $\Phi_q$
would amount to checking if the execution time between the two nodes of interest
(q and $ipd_Y(q)$) is constant (e.g. $max_t = min_t$).

A longer-term goal is to allow policies to simultaneously deal with safety and 
security properties of cyberphysical systems. 


Appendix 


Proposition 1 


Assume that all the traces in $[\![TA : q_s \mapsto q]\!](\sigma, \delta)$ are successful and we want to
show that there exists $t \in [\![TA : q_s \mapsto q]\!](\sigma, \delta)$ with a maximal length m.

We use results from model-checking for timed automata [15]. As in [15] we
first transform our automaton to an equivalent diagonal-free automaton, that
is an automaton where clocks appearing in its guards and invariants can be
compared only to integers (e.g. $r_1 - r_2 < 5$ is not allowed). We then define the
region graph RG(TA) of TA, that is a finite graph whose nodes
are of the form (q, reg) where reg is a clock region, that is an equivalence class
defined on the clock states (for details we refer to [15]). Configurations of RG(TA)
are of the form $((q, reg), \sigma)$ and we have that $((q, reg), \sigma) \Longrightarrow ((q', reg'), \sigma')$ if
there are $\delta \in reg$, $\delta' \in reg'$, $d > 0$, $\sigma'$ such that the automaton TA performs the
transition $(q, \sigma, \delta) \xrightarrow{d} (q', \sigma', \delta')$. Lemma 1 of [15] then states that each abstract
run (finite or infinite) in the region graph RG(TA) can be instantiated by a run
(finite or infinite resp.) in TA and vice versa. This is based on the property of
the region graph of being pre-stable, that is that $((q, reg), \sigma) \Longrightarrow ((q', reg'), \sigma')$ if
$\forall \delta \in reg$ there are $\delta' \in reg'$, $d > 0$, $\sigma'$ such that $(q, \sigma, \delta) \xrightarrow{d} (q', \sigma', \delta')$. Therefore
the computation tree T of $(q, \sigma, \delta)$ in TA has the same depth as the computation
tree T' of $((q, [\delta]), \sigma)$ in RG(TA) where $[\delta]$ is the region that contains all the
clock states that are equivalent to $\delta$. We then recall König's infinity lemma as it
applies to trees: every tree that has infinitely many vertices but is locally
finite (each vertex has finitely many successor vertices) has at least one infinite
path [11]. It is immediate that T' is a locally finite tree. Now if T' is infinite
then by König's infinity lemma we have that T' has an infinite path and thus
using Lemma 1 of [15] we have also that T has an infinite path that corresponds
to a trace of $(q_s, \sigma, \delta)$ in TA, which contradicts our assumption that all the traces
of $(q, \sigma, \delta)$ are finite. Therefore we can conclude that T' has a finite depth and
therefore also T, and that they are equal to the number m.


Proof of Fact 2 


Proof. The first equation is straightforward by the definition of the post-dominator
relation. For the second one, that is when y is a successor (an immediate
one) of q, the only post-dominator of q is the node y, since there exists
a non-trivial path $\pi = q\, act\, y \in \Pi_{(q,y)}$ (for some action act) such that the trivial
path $\pi(1) = y$ contains only y, and therefore for any other path $\pi' \in \Pi_{(q,y)}$
in which a node q' different from y is contained in $\pi'(1)$, q' cannot be a post-dominator
of q since it is not contained in the trivial path $\pi(1)$. To understand
the last equation notice that if a node q'' post-dominates all of the successors of
q or it is a successor of q that post-dominates all the other successors of q then
all the non-trivial paths from q to y will always visit q'' and thus $q'' \in pdom_y(q)$;
similarly if $q'' \notin \bigcap_{q' \in succ(q)} (\{q'\} \cup pdom_y(q'))$ then there exists a successor q' of
q, $q' \neq q''$, such that q'' does not post-dominate q' and thus we can find a non-trivial
path $\pi \in \Pi_{(q,y)}$ that starts with $q\, act\, q'$ (for some action act) and does
not contain q'' and thus q'' is not a post-dominator of q.
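The characterisation used in this proof can be checked on small graphs with a brute-force Python sketch. It assumes (this reading is not spelled out in this appendix) that a node q' is a Y post-dominator of q when at least one non-trivial path from q to Y exists and q' occurs in the 1-suffix of every such path, where paths stop at the first node of Y. The function name and the two sample graphs are hypothetical.

```python
def y_post_dominators(succ, q, Y):
    """Nodes that occur on every non-trivial path from q to a node of Y."""
    nodes = set(succ) | {m for targets in succ.values() for m in targets}

    def reaches_Y(avoid):
        # Search from the successors of q, never stepping onto `avoid`
        # and stopping as soon as a node of Y is met.
        seen, stack = set(), list(succ.get(q, ()))
        while stack:
            n = stack.pop()
            if n == avoid or n in seen:
                continue
            if n in Y:
                return True
            seen.add(n)
            stack.extend(succ.get(n, ()))
        return False

    if not reaches_Y(None):
        return set()  # no path from q to Y at all
    # c is a post-dominator exactly when removing c cuts q off from Y.
    return {c for c in nodes if not reaches_Y(c)}


# Diamond 1 -> {2, 3} -> 4 and chain 1 -> 2 -> 3 -> 4, with Y = {4}.
diamond = {1: [2, 3], 2: [4], 3: [4], 4: []}
chain = {1: [2], 2: [3], 3: [4], 4: []}
print(y_post_dominators(diamond, 1, {4}))          # {4}
print(sorted(y_post_dominators(chain, 1, {4})))    # [2, 3, 4]
```

On the diamond, the result agrees with the intersection form discussed in the proof: $(\{2\} \cup pdom_y(2)) \cap (\{3\} \cup pdom_y(3)) = \{2, 4\} \cap \{3, 4\} = \{4\}$.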


Proof of Fact 3 


Proof. To prove that $ipdom_Y(q)$ is singleton we consider two cases. In the case
that $pdom_Y(q) = \{q'\}$ the proof is trivial.

Assume now that $pdom_Y(q) = \{q_1, \dots, q_n\}$ ($n \geq 2$) and take an arbitrary
non-trivial path $\pi \in \Pi_{(q,Y)}$ and find the closest to q (the one that appears first
in the path) Y post-dominator $q_j \in pdom_Y(q)$ in that path. Next note that
$q_j \notin Y$ since if $q_j \in Y$, we could shorten that path to the point that we meet $q_j$
for the first time and thus we have found a non-trivial path $\pi' \in \Pi_{(q,Y)}$ (since
$q_j \in Y$) in which $\forall i \neq j : q_i \notin \pi'(1)$ and thus $\forall i \neq j : q_i \notin pdom_Y(q)$, which
contradicts our assumption. Next to prove that $\forall i \neq j : q_i \in pdom_Y(q_j)$ assume
that this is not the case and thus we can find $q_l \neq q_j : q_l \notin pdom_Y(q_j)$. Therefore
we can find a path $\pi'' \in \Pi_{(q_j,Y)}$ such that $q_l \notin \pi''(1)$, but this means that if
we concatenate the paths $\pi'$ and $\pi''$ we have a path in $\Pi_{(q,Y)}$ in which $q_l$ does
not belong to it and thus $q_l$ does not belong in its 1-suffix either and therefore
$q_l \notin pdom_Y(q)$, which again contradicts our assumption.

Finally to prove that $ipdom_Y(q)$ is singleton assume that there exists another
Y post-dominator of q, $q_l$, such that $q_l \neq q_j$ and $q_l \notin Y$ and $q_j \in pdom_Y(q_l)$.
Then this means that $q_j$ belongs in all the 1-suffixes of the paths in the set
$\Pi_{(q_l,Y)}$. Therefore take $\pi = q \dots q_j \dots y \in \Pi_{(q,Y)}$ (for some $y \in Y$) such that $\pi$
contains no cycles (i.e. each node occurs exactly once in the path) but then there
exists a path $\pi' = q_j \dots y$ (the suffix of the path $\pi$) such that $q_l \notin \pi'$ and thus
$q_l \notin pdom_Y(q_j)$, which contradicts our assumption. Therefore we have proved
that $q_j$ is the unique immediate Y post-dominator of q.


Proof of Lemma 1


Proof. Assume that $(q, \sigma_1, \delta_1) \stackrel{D_1}{\Longrightarrow}_Y (q', \sigma_1', \delta_1')$ because of the trace

$(q, \sigma_1, \delta_1) = (q_{01}, \sigma_{01}, \delta_{01}) \xrightarrow{d_1} \dots \xrightarrow{d_k} (q_{k1}, \sigma_{k1}, \delta_{k1}) = (q', \sigma_1', \delta_1')$   (*)

where $k > 0$ and $\forall i \in \{1, \dots, k-1\} : q_{i1} \notin Y$ and $D_1 = \sum_{j=1}^{k} d_j$ and the first
transition of the trace has happened because of the edge $e \in E_q$.
We shall consider two main cases. The one where q is in $Q_{\leadsto w}$ and one where
it is not.


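Main Case 1: q is in $Q_{\leadsto w}$. In that case $q' \in Y_w$, and thus we only have to prove
that $(\sigma_2, \delta_2)$ can reach q'. We start by proving a small fact.
First for a set of variables and clocks Z, and two pairs $(\sigma, \delta), (\sigma', \delta')$ we write


$(\sigma, \delta) \equiv_Z (\sigma', \delta')$ iff $\forall x : (x \in Z \wedge \mathcal{L}(x) = L) \Rightarrow \sigma(x) = \sigma'(x) \;\wedge$
$\forall r : (r \in Z \wedge \mathcal{L}(r) = L) \Rightarrow \delta(r) = \delta'(r)$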
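As a small illustration of the relation just defined, the following Python sketch checks low equivalence restricted to a set Z. The dictionary encoding of $\sigma$, $\delta$ and of the security assignment, as well as the sample values, are assumptions made for this example only.

```python
def low_equivalent_on(Z, level, sigma1, delta1, sigma2, delta2):
    """True when both pairs agree on every low variable and low clock in Z."""
    vars_agree = all(sigma1[x] == sigma2[x]
                     for x in sigma1 if x in Z and level[x] == "L")
    clocks_agree = all(delta1[r] == delta2[r]
                       for r in delta1 if r in Z and level[r] == "L")
    return vars_agree and clocks_agree


# Made-up security assignment: l and the clock t are low, h is high.
level = {"l": "L", "h": "H", "t": "L"}
s1, s2 = {"l": 0, "h": 1}, {"l": 0, "h": 0}
d1, d2 = {"t": 0.0}, {"t": 2.5}
print(low_equivalent_on({"l", "h"}, level, s1, d1, s2, d2))       # True
print(low_equivalent_on({"l", "h", "t"}, level, s1, d1, s2, d2))  # False
```

Restricting to Z is what lets the proof ignore low names that no later condition or expression reads.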


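Next, for a finite path $\pi = q_0 act_1 q_1 \dots q_{n-1} act_n q_n$ we define the auxiliary
operator Z(.) as $Z(\pi) = \bigcup_i \left(\bigcup_{e' \in E_{q_i}} fv(con(e')) \cup fv(expr(e'))\right)$.
Now we will prove that for a path $\pi = q_{01} act_1 q_{11} \dots q_{(n-1)1} act_n q_{n1} \in \Pi_{(e,Y)}$, if

$(q, \sigma_1, \delta_1) = (q_{01}, \sigma_{01}, \delta_{01}) \xrightarrow{d_1} \dots \xrightarrow{d_l} (q_{l1}, \sigma_{l1}', \delta_{l1}')$ ($l \leq n$)   (1)

using the edges $(q_{01}, act_1, q_{11}), \dots, (q_{(l-1)1}, act_l, q_{l1})$ and $(\sigma_1, \delta_1) \equiv_{Z(\pi)} (\sigma_2, \delta_2)$
then $\exists (\sigma_{l2}', \delta_{l2}')$:

$(q, \sigma_2, \delta_2) = (q_{01}, \sigma_{02}, \delta_{02}) \xrightarrow{d_1} \dots \xrightarrow{d_l} (q_{l1}, \sigma_{l2}', \delta_{l2}')$   (a)
and
$l < n \Rightarrow (\sigma_{l1}', \delta_{l1}') \equiv_{Z(\pi(l))} (\sigma_{l2}', \delta_{l2}')$   (b)

where recall that $\pi(l)$ is the l-suffix of $\pi$. The proof proceeds by induction on l.

Base Case l = 1. To prove (a), let $e = (q_{01}, g \to x := a : r, q_{11})$ and note that
because $(\sigma_1, \delta_1) \equiv_{Z(\pi)} (\sigma_2, \delta_2)$ and con(e) contains only low variables (since
$q_{01} = q \in Q_{\leadsto w}$ and C1 (a)) it is immediate that there exists $\sigma_{12}' = \sigma_2[x \mapsto [\![a]\!]\sigma_2]$,
$\delta_{12}' = (\delta_2 + d_1)[r \mapsto 0]$ such that $[\![I(q_{01})]\!](\sigma_2, \delta_2 + d_1) = tt$ and
$[\![I(q_{11})]\!](\sigma_{12}', \delta_{12}') = tt$, and $(q_{01}, \sigma_2, \delta_2) \xrightarrow{d_1} (q_{11}, \sigma_{12}', \delta_{12}')$.

Now if l < n, to prove (b) we consider two cases. One where $A_e$ is true and one
where it is false. If $A_e$ is true we note that $(\sigma_{11}', \delta_{11}') \equiv_{Z(\pi)} (\sigma_{12}', \delta_{12}')$, and then
it is immediate that also $(\sigma_{11}', \delta_{11}') \equiv_{Z(\pi(1))} (\sigma_{12}', \delta_{12}')$ as required. Otherwise,
if $A_e$ is false then $\Psi_e$ is true and thus $(\sigma_{11}', \delta_{11}') \equiv_{Z(\pi(1))} (\sigma_{12}', \delta_{12}')$, because
the two pairs are still low equivalent for the variables that are not used in the
assignment of e, while the ones used in the assignment of e do not appear
in any condition (or expression) of an edge of a node q that belongs in $\pi(1)$.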


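Inductive Case $l = l_0 + 1$ ($l_0 > 0$). Because of the trace in (1) we have that
$t_1 = (q_{01}, \sigma_{01}, \delta_{01}) \xrightarrow{d_1} (q_{11}, \sigma_{11}', \delta_{11}')$ and
$t_2 = (q_{11}, \sigma_{11}', \delta_{11}') \xrightarrow{d_2} \dots \xrightarrow{d_l} (q_{l1}, \sigma_{l1}', \delta_{l1}')$.
Using our induction hypothesis on $t_1$ we have that there exists $(\sigma_{12}', \delta_{12}')$ such
that $(q_{01}, \sigma_2, \delta_2) \xrightarrow{d_1} (q_{11}, \sigma_{12}', \delta_{12}')$ and $(\sigma_{11}', \delta_{11}') \equiv_{Z(\pi(1))} (\sigma_{12}', \delta_{12}')$ and the
proof is completed using our induction hypothesis on $t_2$. The proof of Main
Case 1 follows by the result (a) of the fact from above, taking the path $\pi$
that corresponds to the trace (*) and using that $(\sigma_1, \delta_1) \equiv_{Z(\pi)} (\sigma_2, \delta_2)$ (since
$(\sigma_1, \delta_1) \equiv (\sigma_2, \delta_2)$ and all the nodes in $\pi$ except $q_{k1}$ have edges whose conditions
contain only low variables). Therefore, since $(\sigma_1, \delta_1)$ creates the trace (*) we also
have that $\exists (\sigma_2', \delta_2')$ :

$(q, \sigma_2, \delta_2) = (q_{01}, \sigma_{02}, \delta_{02}) \xrightarrow{d_1} \dots \xrightarrow{d_k} (q_{k1}, \sigma_{k2}, \delta_{k2}) = (q', \sigma_2', \delta_2')$

and thus for $D_2 = d_1 + \dots + d_k$ we have that

$(q, \sigma_2, \delta_2) \stackrel{D_2}{\Longrightarrow}_Y (q', \sigma_2', \delta_2')$

where $q' \in Y_w$ and this completes the proof for this case.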


Main Case 2: When q is not in $Q_{\leadsto w}$. The proof proceeds by induction on the
length k of the trace (*).


Base Case (k = 1). We have that

$(q, \sigma_1, \delta_1) \xrightarrow{d_1} (q', \sigma_1', \delta_1')$

and let $e = (q, g \to x := a : r, q')$, then it is immediate that $D_1 = d_1$, $\sigma_1' = \sigma_1[x \mapsto [\![a]\!]\sigma_1]$,
$\delta_1' = (\delta_1 + d_1)[r \mapsto 0]$ and $[\![I(q)]\!](\sigma_1, \delta_1 + d_1) = tt$ and $[\![I(q')]\!](\sigma_1', \delta_1') = tt$.
We shall consider two subcases, one where the unique immediate Y post-dominator
of q is defined and one where it is not.


Subcase 1: When the unique immediate Y post-dominator $ipd_Y(q)$ is defined. It
has to be the case then that $q' = ipd_Y(q)$ since $q' \in Y$ and in particular, we have
that $q' \in Y_s$. We will proceed by considering two other subcases of the Subcase
1, one where the condition $\Phi_q$ is true and one where it is false.


Subcase 1(a): When $\Phi_q$ is true. Then it is the case that all the variables of the
condition con(e) are low and thus it is immediate that there exists $d_2 = d_1$ and
$\sigma_2' = \sigma_2[x \mapsto [\![a]\!]\sigma_2]$, $\delta_2' = (\delta_2 + d_2)[r \mapsto 0]$ and $[\![I(q)]\!](\sigma_2, \delta_2 + d_2) = tt$ and
$[\![I(q')]\!](\sigma_2', \delta_2') = tt$ such that $(q, \sigma_2, \delta_2) \xrightarrow{d_2} (q', \sigma_2', \delta_2')$, which implies that for
$D_2 = d_2$

$(q, \sigma_2, \delta_2) \stackrel{D_2}{\Longrightarrow}_Y (q', \sigma_2', \delta_2')$

Finally, because $sec_{Y,\mathcal{L}}(TA)$, condition C2 (a) gives us that $A_e$ is true, and thus
all the explicit flows arising from the assignments x:=a are permissible and
thus $(\sigma_1', \delta_1') \equiv (\sigma_2', \delta_2')$ as required.


Subcase 1(b): When $\Phi_q$ is false. If it is the case that all the variables in the
condition con(e) are low then the proof proceeds as in Subcase 1(a).

For the case now that at least one variable in the condition con(e) is high,
because $sec_{Y,\mathcal{L}}(TA)$, condition C2 (a) and Fact 4 ensure that $\forall x : \mathcal{L}(x) = L \Rightarrow$
$\sigma_1'(x) = \sigma_1(x)$ and $\forall r : \mathcal{L}(r) = L \Rightarrow \delta_1'(r) = \delta_1(r) + d_1$. Since $\Phi_q$ is false, $(\sigma_1, \delta_1)$
and $(\sigma_2, \delta_2)$ have the same termination behaviour and thus there exists $d_2 = d_1$
and $(\sigma_2', \delta_2')$ such that $(q, \sigma_2, \delta_2) \xrightarrow{d_2} (q', \sigma_2', \delta_2')$ and therefore for $D_2 = d_2$ we
have that

$(q, \sigma_2, \delta_2) \stackrel{D_2}{\Longrightarrow}_Y (q', \sigma_2', \delta_2')$

We just showed that $(\sigma_1', \delta_1') \equiv (\sigma_1, \delta_1 + d_1) \equiv (\sigma_2, \delta_2 + d_2)$ and we will now show
that $(\sigma_2', \delta_2') \equiv (\sigma_2, \delta_2 + d_2)$.
Now if

$(q, \sigma_2, \delta_2) \xrightarrow{d_2} (q', \sigma_2', \delta_2')$

using the edge e or an edge $e' \neq e$ such that con(e') contains a high variable, since
$sec_{Y,\mathcal{L}}(TA)$, condition C2 (a) and Fact 4 give that $\forall x : \mathcal{L}(x) = L \Rightarrow \sigma_2'(x) =
\sigma_2(x)$ and $\forall r : \mathcal{L}(r) = L \Rightarrow \delta_2'(r) = \delta_2(r) + d_2$ and therefore $(\sigma_2', \delta_2') \equiv (\sigma_2, \delta_2 + d_2)$
as required. If now con(e') contains only low variables, $(\sigma_1, \delta_1)$ is a witness of
sat(con(e) ∧ con(e')) and therefore because $sec_{Y,\mathcal{L}}(TA)$, using the condition C2
(b) and Fact 4 we work as before and we obtain that $(\sigma_2', \delta_2') \equiv (\sigma_2, \delta_2 + d_2)$.


Subcase 2: When the unique immediate Y post-dominator of q is not defined. In
that case, all the variables in con(e) are low. If q' is in $Y_w$ we have that $e \leadsto w$
and we proceed as in Main Case 1. Otherwise, we proceed as in Subcase 1(a).
This completes the case for k = 1.


Inductive Case ($k = k_0 + 1$). We have that

$(q, \sigma_1, \delta_1) = (q_{01}, \sigma_{01}, \delta_{01}) \xrightarrow{d_1} \dots \xrightarrow{d_k} (q_{k1}, \sigma_{k1}, \delta_{k1}) = (q', \sigma_1', \delta_1')$

and recall that the first transition happened because of the edge e and that q is
not in $Q_{\leadsto w}$.

We shall consider two cases again, one where the unique immediate Y post-dominator
of q is defined and one where it is not.


Subcase 1: When the unique immediate Y post-dominator $ipd_Y(q)$ is defined. We
will proceed by considering two subcases of Subcase 1, one where $\Phi_q$ is true and
one where $\Phi_q$ is false.


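Subcase 1(a): When $\Phi_q$ is true. Since $\Phi_q$ is true we have that all the variables in
con(e) are low and thus $\exists d_1' = d_1$ and $(\sigma_{12}, \delta_{12}) \equiv (\sigma_{11}, \delta_{11})$ (this is ensured by
our assumptions that $sec_{Y,\mathcal{L}}(TA)$ and the predicate $A_e$ of the condition C2 (a)
that takes care of the explicit flows arising from the assignment in the edge e)
such that

$(q, \sigma_2, \delta_2) = (q_{01}, \sigma_{02}, \delta_{02}) \xrightarrow{d_1'} (q_{11}, \sigma_{12}, \delta_{12})$   (1)

Since q is not in $Q_{\leadsto w}$, note that it is also the case that $q_{11}$ is not in $Q_{\leadsto w}$ and
thus using that $(\sigma_{12}, \delta_{12}) \equiv (\sigma_{11}, \delta_{11})$ and our induction hypothesis on the trace

$(q_{11}, \sigma_{11}, \delta_{11}) \xrightarrow{d_2} \dots \xrightarrow{d_k} (q_{k1}, \sigma_{k1}, \delta_{k1})$

we have that $\exists (\sigma_2', \delta_2')$ and $D_2'$ such that

$(q_{11}, \sigma_{12}, \delta_{12}) \stackrel{D_2'}{\Longrightarrow}_Y (q', \sigma_2', \delta_2')$   (2)

and therefore by (1) and (2) and for $D_2 = d_1' + D_2'$ we have that

$(q, \sigma_2, \delta_2) \stackrel{D_2}{\Longrightarrow}_Y (q', \sigma_2', \delta_2')$

and $(\sigma_1', \delta_1') \equiv (\sigma_2', \delta_2') \vee q' \in Y_w$ as required.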


Subcase 1(b): When $\Phi_q$ is false. In the case that all the variables in con(e) are
low then the proof proceeds as in Subcase 1(a).

Assume now that at least one variable in con(e) is high. Since $ipd_Y(q)$ is
defined then there exists $j \in \{1, \dots, k\}$ such that $q_{j1} = ipd_Y(q)$ and
$\forall i \in \{1, \dots, j-1\} : q_{i1} \neq ipd_Y(q)$. Therefore we have that

$(q_{01}, \sigma_{01}, \delta_{01}) \xrightarrow{d_1} \dots \xrightarrow{d_j} (q_{j1}, \sigma_{j1}, \delta_{j1}) \xrightarrow{d_{j+1}} \dots \xrightarrow{d_k} (q_{k1}, \sigma_{k1}, \delta_{k1})$

Next, using that $sec_{Y,\mathcal{L}}(TA)$, condition C2 (a) and Fact 4 give us that $\forall x :
\mathcal{L}(x) = L \Rightarrow \sigma_{j1}(x) = \sigma_{01}(x)$ and $\forall r : \mathcal{L}(r) = L \Rightarrow \delta_{j1}(r) = \delta_{01}(r) + d_1 + \dots + d_j$.
Since $\Phi_q$ is false, $(\sigma_1, \delta_1)$ and $(\sigma_2, \delta_2)$ have the same termination behaviour and
thus there exists a trace $t' \in [\![TA : q \mapsto ipd_Y(q)]\!](\sigma_2, \delta_2)$ and $d_1', \dots, d_l'$ such that
$d_1 + \dots + d_j = d_1' + \dots + d_l'$ and $(\sigma_{l2}, \delta_{l2})$ such that t' is

$(q, \sigma_2, \delta_2) = (q_{02}, \sigma_{02}, \delta_{02}) \xrightarrow{d_1'} \dots \xrightarrow{d_l'} (q_{l2}, \sigma_{l2}, \delta_{l2})$   (3)

and $q_{l2} = ipd_Y(q)$.

It is immediate that $\forall x : \mathcal{L}(x) = L \Rightarrow \sigma_{l2}(x) = \sigma_{02}(x)$ and $\forall r : \mathcal{L}(r) = L \Rightarrow
\delta_{l2}(r) = \delta_{02}(r) + d_1' + \dots + d_l'$. To see how we obtain this result, we have that if t' has
started using the edge e or an edge $e' \neq e$, where con(e') contains at least one high
variable, then this result follows by our assumptions that $sec_{Y,\mathcal{L}}(TA)$, condition
C2 (a) and Fact 4. Now if t' has started using an edge $e' \neq e$ and con(e')
contains only low variables then $(\sigma_1, \delta_1)$ is a witness of sat(con(e) ∧ con(e')) and
the result follows by our assumptions that $sec_{Y,\mathcal{L}}(TA)$, condition C2 (b) and
Fact 4. Therefore in any case $(\sigma_{j1}, \delta_{j1}) \equiv (\sigma_{l2}, \delta_{l2})$.

Now if $ipd_Y(q) = q_{k1}$ the proof has been completed. Otherwise we have that
$ipd_Y(q)$ is not in $Q_{\leadsto w}$ and the proof follows by an induction on the trace

$(q_{j1}, \sigma_{j1}, \delta_{j1}) \xrightarrow{d_{j+1}} \dots \xrightarrow{d_k} (q_{k1}, \sigma_{k1}, \delta_{k1})$

using that $(\sigma_{j1}, \delta_{j1}) \equiv (\sigma_{l2}, \delta_{l2})$.


Subcase 2: When the unique immediate Y post-dominator of q is not defined. In
that case, all the variables in con(e) are low. Therefore, if $e \leadsto w$ we proceed
similar to Main Case 1, otherwise we proceed as in Subcase 1(a).

This completes our proof.


Proof of Lemma 2


Proof. Assume that $(q, \sigma_1, \delta_1) \Longrightarrow_Y \bot$ and thus either there exists a finite
unsuccessful trace t

$(q, \sigma_1, \delta_1) = (q_{01}, \sigma_{01}, \delta_{01}) \xrightarrow{d_1} \dots \xrightarrow{d_n} (q_{n1}, \sigma_{n1}, \delta_{n1})$ ($n \geq 0$)

such that $\forall i \in \{1, \dots, n\} : q_{i1} \notin Y$ and $(q_{n1}, \sigma_{n1}, \delta_{n1})$ is stuck, or there exists an
infinite unsuccessful trace t

$(q, \sigma_1, \delta_1) = (q_{01}, \sigma_{01}, \delta_{01}) \xrightarrow{d_1} \dots \xrightarrow{d_n} (q_{n1}, \sigma_{n1}, \delta_{n1}) \xrightarrow{d_{n+1}} \dots$

such that $\forall i > 0 : q_{i1} \notin Y$.
Assume now that all the traces from $(q, \sigma_2, \delta_2)$ to a node $q' \in Y$ are successful,
which means that $(q, \sigma_2, \delta_2) \not\Longrightarrow_Y \bot$ and thus by Proposition 1 the set

$\{k \mid (q_0, \sigma_0, \delta_0) \xrightarrow{d_1'} \dots \xrightarrow{d_k'} (q_k, \sigma_k, \delta_k) : (q_0, \sigma_0, \delta_0) = (q, \sigma_2, \delta_2) \wedge q_k \in Y \wedge$
$\forall i \in \{1, \dots, k-1\} : q_i \notin Y\}$

has a maximum m.
The proof proceeds by contradiction where we show that we can either construct
an unsuccessful trace of $(q, \sigma_2, \delta_2)$ or a "long" trace t'

$(q, \sigma_2, \delta_2) = (q_{02}, \sigma_{02}, \delta_{02}) \xrightarrow{d_1'} \dots \xrightarrow{d_l'} (q_{l2}, \sigma_{l2}, \delta_{l2})$ ($l > 0$)

where $\forall i \in \{1, \dots, l\} : q_{i2} \notin Y$ and $m < l$ and that would mean that this trace
will either terminate later (at a node in Y) and thus it will have a length greater
than m, or it will result in an unsuccessful trace.

We consider two main cases, one where q is in $Q_{\leadsto w}$ and one where it is not.


Main Case 1: When q is in $Q_{\leadsto w}$. If the trace t of $(q, \sigma_1, \delta_1)$ visits only nodes that
can reach Y ($\forall i : \Pi_{(q_{i1},Y)} \neq \emptyset$) then we proceed similar to the proof of Main Case 1
of Lemma 1, using the results (a) and (b) of the fact proven there. Therefore if t
is infinite we can show that $(\sigma_2, \delta_2)$ can simulate the first m steps of $(\sigma_1, \delta_1)$ and
this gives us the desired trace t'. Similarly, in case of t being a finite unsuccessful
trace that stops at the node $q_{n1}$, and $(q_{n1}, \sigma_{n1}, \delta_{n1})$ is stuck, we can also show
that $(\sigma_2, \delta_2)$ can reach the node $q_{n1}$ (using the result (a)) and the resulting
configuration will be stuck (using the result (b)).

Now if the first j > 0 nodes $q_{01} \dots q_{j1}$ (visited by t) can reach Y and then for
the node $q_{(j+1)1}$ we have that $\Pi_{(q_{(j+1)1},Y)} = \emptyset$, we can show similarly as before
that $(\sigma_2, \delta_2)$ can reach the node $q_{(j+1)1}$ (using the results (a) and (b)), and thus
any further computation will lead to an unsuccessful trace since $\Pi_{(q_{(j+1)1},Y)} = \emptyset$.

Finally if t visits only nodes that cannot reach Y ($\forall i : \Pi_{(q_{i1},Y)} = \emptyset$) and thus
also q cannot reach Y, the proof is trivial since all the traces of $(q, \sigma_2, \delta_2)$ will
be unsuccessful with respect to Y. This completes the proof of Main Case 1.


Main Case 2: When q is not in $Q_{\leadsto w}$. We will now present a finite construction
strategy for the desired trace t'.

Construction. We start by looking at the configurations $(q, \sigma_1, \delta_1)$, $(q, \sigma_2, \delta_2)$, the
unsuccessful trace t of $(\sigma_1, \delta_1)$, and we remember that so far we have created a
trace $t' = (q, \sigma_2, \delta_2)$ of length l = 0. We proceed according to the following cases:


Case 1: When the unique immediate Y post-dominator $ipd_Y(q)$ of q is defined.
We then consider two subcases, one where $\Phi_q$ is false and one where $\Phi_q$ is true.


Subcase (a): $\Phi_q$ is false. Now if the trace t does not visit $ipd_Y(q)$, we have that
$(\sigma_1, \delta_1)$ and $(\sigma_2, \delta_2)$ have the same termination behaviour (using that $\Phi_q$ is false)
and thus there exists a trace t' of $(\sigma_2, \delta_2)$ that never visits $ipd_Y(q)$. However,
then we would have the case that t' is an unsuccessful trace with respect to q
and the set Y, which contradicts our assumptions.

If the trace t does visit $ipd_Y(q)$, then it has to be the case that $ipd_Y(q)$ is
not in Y. Assume now that t starts with an edge $e \in E_q$. If con(e) contains
only low variables then $\exists d_1' = d_1$ and $(\sigma_{12}, \delta_{12}) \equiv (\sigma_{11}, \delta_{11})$ (this is ensured by
our assumptions that $sec_{Y,\mathcal{L}}(TA)$ and the predicate $A_e$ of condition C2 (a) that
takes care of the explicit flows arising from the assignment in the edge e) such
that

$(q, \sigma_2, \delta_2) = (q_{02}, \sigma_{02}, \delta_{02}) \xrightarrow{d_1'} (q_{12}, \sigma_{12}, \delta_{12})$

where $q_{12} = q_{11}$. If now $m < l + 1$ then we have our desired trace t' and we stop.

Otherwise, notice that also $q_{11}$ is not in $Q_{\leadsto w}$ and we repeat the Construction
by looking at the configurations $(q_{11}, \sigma_{11}, \delta_{11})$, $(q_{11}, \sigma_{12}, \delta_{12})$, the suffix of t that
starts with $(q_{11}, \sigma_{11}, \delta_{11})$ and we remember that so far we have created the trace

$t' = (q_{02}, \sigma_{02}, \delta_{02}) \xrightarrow{d_1'} (q_{12}, \sigma_{12}, \delta_{12})$   ($(q, \sigma_2, \delta_2) = (q_{02}, \sigma_{02}, \delta_{02})$)

that has length equal to $l + 1$.

Now if con(e) contains at least one high variable then we look at the first
occurrence of $ipd_Y(q)$ in t and let that be the configuration $(q_{h1}, \sigma_{h1}, \delta_{h1})$
for some h > 0. Therefore, since $sec_{Y,\mathcal{L}}(TA)$, using the condition C2 (a) and
Fact 4 we have that $\forall x : \mathcal{L}(x) = L \Rightarrow \sigma_{h1}(x) = \sigma_{01}(x)$ and $\forall r : \mathcal{L}(r) = L \Rightarrow
\delta_{h1}(r) = \delta_{01}(r) + d_1 + \dots + d_h$. Since $\Phi_q$ is false, $(\sigma_1, \delta_1)$ and $(\sigma_2, \delta_2)$ have the same
termination behaviour and thus there exists a trace $t' \in [\![TA : q \mapsto ipd_Y(q)]\!](\sigma_2, \delta_2)$
and $d_1', \dots, d_j'$ such that $d_1 + \dots + d_h = d_1' + \dots + d_j'$ and $(\sigma_{j2}, \delta_{j2})$ such that t' is

$(q, \sigma_2, \delta_2) = (q_{02}, \sigma_{02}, \delta_{02}) \xrightarrow{d_1'} \dots \xrightarrow{d_j'} (q_{j2}, \sigma_{j2}, \delta_{j2})$

where $q_{j2} = ipd_Y(q)$.

Now if $j + l > m$ we have constructed the required trace t'.

Otherwise, we have that $\forall x : \mathcal{L}(x) = L \Rightarrow \sigma_{j2}(x) = \sigma_{02}(x)$ and $\forall r : \mathcal{L}(r) =
L \Rightarrow \delta_{j2}(r) = \delta_{02}(r) + d_1' + \dots + d_j'$. To see how we obtain this result, we have that
if t' has started using the edge e or an edge $e' \neq e$, where con(e') contains at least
one high variable, then this result follows by our assumptions that $sec_{Y,\mathcal{L}}(TA)$,
condition C2 (a) and Fact 4. Now if t' has started using an edge $e' \neq e$ and
con(e') has only low variables then $(\sigma_1, \delta_1)$ is a witness of sat(con(e) ∧ con(e'))
and the result follows again by our assumptions that $sec_{Y,\mathcal{L}}(TA)$, condition C2
(b) and Fact 4. Therefore in any case $(\sigma_{h1}, \delta_{h1}) \equiv (\sigma_{j2}, \delta_{j2})$ and thus we repeat
the Construction by looking at the configurations $(q_{h1}, \sigma_{h1}, \delta_{h1})$, $(q_{j2}, \sigma_{j2}, \delta_{j2})$,
the suffix of t that starts with $(q_{h1}, \sigma_{h1}, \delta_{h1})$ and we remember that so far we
have created the trace t'

$(q, \sigma_2, \delta_2) = (q_{02}, \sigma_{02}, \delta_{02}) \xrightarrow{d_1'} \dots \xrightarrow{d_j'} (q_{j2}, \sigma_{j2}, \delta_{j2})$

of length equal to $l + j$.


Subcase (b): $\Phi_q$ is true. Then if t starts with the edge e, because $sec_{Y,\mathcal{L}}(TA)$,
con(e) contains only low variables and we proceed as in Subcase (a).


Case 2: When the unique immediate Y post-dominator $ipd_Y(q)$ of q is not
defined. In this case, if t starts with the edge e, because $sec_{Y,\mathcal{L}}(TA)$ we have
that con(e) contains only low variables. Now if $e \leadsto w$, working as in Main Case
1 we can get an unsuccessful trace t', otherwise we proceed as in Subcase (a).


Proof of Theorem 1 


Proof. Let

$Z = \{((q, \sigma, \delta), (q, \sigma', \delta')) \mid [\![I(q)]\!](\sigma, \delta) \wedge [\![I(q)]\!](\sigma', \delta') \wedge (\sigma, \delta) \equiv (\sigma', \delta')\}$
$\quad \cup \{(\bot, \bot)\}$

It is immediate by Lemmas 1 and 2 that Z is a Y-bisimulation and that

$\forall q \in \{q_\circ\} \cup Y : \forall (\sigma, \delta), (\sigma', \delta') : [\![I(q)]\!](\sigma, \delta) \wedge [\![I(q)]\!](\sigma', \delta') \wedge (\sigma, \delta) \equiv (\sigma', \delta')$
$\quad \Rightarrow ((q, \sigma, \delta), (q, \sigma', \delta')) \in Z$

Therefore since $\sim_Y$ is the largest Y-bisimulation we have that $Z \subseteq\; \sim_Y$ and thus
TA satisfies the information security policy $\mathcal{L}$.


Abstract. Reasoning about information flow in a concurrent setting is
notoriously difficult due in part to timing channels that may leak sensitive
information. In this paper, we present a compositional and flexible
type-and-effect system that guarantees non-interference by disallowing
potentially insecure races that can be exploited through internal timing
attacks. In contrast to many previous approaches, which disallow all
races on public variables, we use an explicit scheduler model to give a
more permissive security definition and type system, which allows benign
races on public variables. To achieve compositionality, we use the idea of
resources from separation logic, both to locally specify and reason about
whether accesses may be racy and to bound the security level of data
that may be learned through scheduling.


1 Introduction 


Non-interference [15] is an important security property. Informally, a program 
satisfies non-interference if its publicly observable (low) outputs are indepen- 
dent of its private (high) inputs. In spite of the vast body of research on 
non-interference, reasoning about information flow control and enforcing non- 
interference for imperative concurrent programs remains a difficult problem. One 
of the main problems is prevention of information flows that originate from inter- 
action of the scheduler with individual threads, also known as internal timing 
leaks. 


Example 1. Consider the following program [44]. 


fork(delay(50); l := 1); // Thread 1
fork(if h then skip else delay(100); l := 2); // Thread 2


1 delay(n) is used as an abbreviation for skip;...;skip n times, i.e., it models a
computation that takes n reduction steps.


© The Author(s) 2018 
L. Bauer and R. Küsters (Eds.): POST 2018, LNCS 10804, pp. 53-78, 2018.
https://doi.org/10.1007/978-3-319-89722-6_3
In this program, h is a high variable and l is intended to be a low variable.
But the order of the two assignments to l depends on the branch that is picked
by Thread 2. As a result, under many schedulers, a final value of l = 1
reveals to a low observer that h is true.


It may appear that the problem in the above example is that Thread 2 races 
to the low assignment after branching on a secret. The situation is actually worse. 
Without explicit assumptions on the scheduling of threads, a mere presence of 
a high branching in the pool of concurrently running threads is problematic. 


Example 2. Consider the following program, which forks three threads. 


fork(delay(50); l := 1); // Thread 1
fork(if h then skip else delay(100)); // Thread 2
fork(l := 2) // Thread 3


In this program, every individual thread is secure, in the sense that it does not 
leak information about high variables to a low observer. Additionally, pairwise 
parallel composition of any of the threads is secure, too, including a benign race 
fork(l := 1); fork(l := 2). Even if we assume that the attacker fully controls 
the scheduler, the final value of l will be determined only by the scheduler of his 
choice. However, for the parallel execution of all the three threads, if the attacker 
can influence the scheduler, it can leak the secret value of h through public l. 


In this paper, we present a compositional and flexible type-and-effect system 
that supports compositional reasoning about information flow in concurrent pro- 
grams, with minimal assumptions on the scheduler. Our type system is based on 
ideas from separation logic; in particular, we track ownership of variables. An 
assignment to an exclusively-owned low variable is allowed as long as it does not 
create a thread-local information flow violation, regardless of the parallel con- 
text. Additionally, we introduce a notion of a labeled scheduler resource, which 
allows us to distinguish and accept benign races as secure.² A racy low assignment
is allowed as long as the thread’s scheduler resource is low; the latter, in its
turn, prevents parallel composition of the assignment with high threads, avoiding 
potential scheduler leaks. This flexibility allows our type system to accept pair- 
wise parallel compositions of threads from Example 2, while rightfully rejecting 
the composition of all three threads. 

Following the idea of ownership transfer from separation logic, our type sys- 
tem allows static transfer of resource ownership along synchronization primi- 
tives. This enables typing of programs that use synchronization primitives to 
avoid races, as illustrated in the following example. 


2 One could argue that programs should not have any races on assignments at all; but
in general we will want to allow races on some shareable resources (e.g., I/O) and 
that is why we study a setup in which we do try to accommodate benign races to 
assignments. 
Example 3. Consider the following modification of Example 2.


fork(delay(50); l := 1; send(c)); // Thread 1
fork(if h then skip else delay(100)); // Thread 2
recv(c); // recover exclusive ownership of variable l
fork(l := 2) // Thread 3


In this program, Thread 1 sends a message on channel c. Since the main 
program synchronizes on the c channel (by receiving on channel c), Thread 3 is 
not forked until after the assignment l := 1 in Thread 1 has happened. Hence, the
synchronization ensures that there is no race on l and the program is therefore 
secure, even in the presence of the high branching in the concurrent Thread 2. 


Note that unconstrained transfer of resources creates an additional covert 
channel that needs to be controlled. Section 3 describes how our type system 
prevents implicit flows via resource transfer. 

One might expect that synchronization can also be used to allow races after 
high threads are removed from the scheduler. That is, however, problematic, as 
illustrated by the following example. 


Example 4. Consider the following program. 


fork(if h then s1 else s2; send(c)); // Thread 1
recv(c);
fork(l := 1); // Thread 2
fork(l := 2) // Thread 3


The program forks off three threads and uses send(c) and recv(c) on a channel 
c to postpone forking of Thread 2 and 3 until after Thread 1 has finished. Here it 
is possible for the high thread (Thread 1) to taint the scheduler and thus affect its 
choice of scheduling between Threads 2 and 3 after Thread 1 has finished. This 
could, e.g., happen if we have an adaptive scheduler and s1 and s2 have different
workloads. Then the scheduler will be adapted differently depending on whether
h is true or false and therefore the final value of l may reveal the value of h.


To remedy this issue, we introduce a special rescheduling operation that 
resets the scheduler state, effectively removing all possible taint from past high 
threads. 


Example 5. Consider the following variation of Example 4: 


fork(if h then s1 else s2; send(c)); // Thread 1
recv(c);
reschedule; // reset the scheduler state
fork(l := 1); // Thread 2
fork(l := 2) // Thread 3


The reschedule operation resets the scheduler state and therefore no information
about the high variable h is leaked from the high thread and this program
is thus secure.


The above example illustrates that reschedule allows us to remove scheduler 
taint from high threads and thus accept programs with benign races as secure 
after high threads have finished executing. 


Contributions. This paper proposes a new compositional model for enforcing 
information flow security in imperative concurrent programs. The key compo- 
nents of the model are: 


- A fine-grained compositional³ type-and-effect system that prevents internal
  timing leaks by tracking when races may occur and whether the scheduler
  state could be tainted with confidential information. The type-and-effect
  system allows us to verify programs with benign races as secure.

- A novel programming construct for resetting the scheduler state.

- A proof technique for a termination-insensitive notion of security under possible
  low nondeterminism.


We emphasize that our model is independent of the choice of scheduler; the 
only restriction on the runtime system is that it should implement the resched- 
ule operation for resetting the scheduler state. This is a very mild restriction. 
Compared to earlier work that also allows for scheduler independence and
benign low races, our type-and-effect system is, to the best of our knowledge,
much more expressive, in the sense that it allows more programs to be verified as
secure.

The choice of a termination-insensitive security condition as the target condition
is deliberate, because we only consider batch-style executions. We believe that
our results can be extended to progress-insensitive security [2] using standard
techniques. Although termination-insensitive security conditions leak arbitrary
information [3], these leaks occur only via unary encoding of the secret
in the trace and are relatively slow, especially when the secret space is large,
compared to the fast internal timing channels that we aim to close. We do not
consider termination (or progress)-sensitivity because it is generally difficult to
close all possible termination and crashing channels that may be exploited by
the adversary, including resource exhaustion, without appealing to system-level
mechanisms that also mitigate external timing channels. We discuss this in more
detail in Sect. 5. Finally, note that in this paper we only address leaks through
interactions with the scheduler (i.e., the internal timing leaks). Preventing external
leaks is an active area of research and is out of scope of the paper.


Outline. The remainder of this paper is organized as follows. In Sect. 2, we
formally define the concurrent language and our security model. In Sect. 3, we
present the type system for establishing security of concurrent programs. For
reasons of space, an overview of the soundness proof and the detailed proof can
be found in the accompanying appendix. We discuss related work in Sect. 5.
Finally, in Sect. 6, we conclude and discuss future work.


3 We use a standard notion of compositionality for separation-style type systems, see
comments to Theorem 1.


2 Language and Security Model 


We begin by formally defining the syntax and operational semantics of a core con- 
current imperative language. The syntax is defined by the grammar below and 
includes the usual imperative constructs, loops, conditionals and fork. Thread 
synchronization is achieved using channels which support a non-blocking send 
primitive and a blocking receive. In addition, the syntax also includes our novel 
reschedule construct for resetting the scheduler. 


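$$
\begin{aligned}
v \in Val &::= () \mid n \mid tt \mid ff \\
e \in Exp &::= x \mid v \mid e_1 = e_2 \mid e_1 + e_2 \\
s \in Stm &::= \text{skip} \mid s_1; s_2 \mid x := e \mid \text{if } e \text{ then } s_1 \text{ else } s_2 \mid \text{while } e \text{ do } s \\
&\;\;\mid\; \text{fork}(s) \mid \text{send}(ch) \mid \text{recv}(ch) \mid \text{reschedule} \\
K \in ECtx &::= \bullet \mid K; s
\end{aligned}
$$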


Here x and ch range over finite and disjoint sets of variable and channel identi- 
fiers, respectively. The sets are denoted by Var and Chan, respectively. 

The operational semantics is defined as a small-step reduction relation over 
configurations of the form sf, S, T, M, p consisting of a scheduling function sf, a 
scheduler state S, a thread pool T, a message pool M and a heap p. A scheduling 
function sf takes a scheduler state, a thread pool, a message pool and a heap as 
arguments and returns a new scheduler state and a thread identifier of the next 
thread to be scheduled [30,33]. A thread pool T is a partial function from thread 
identifiers to sequences of statements, a message pool is a function from channel 
names to natural numbers, each representing a number of signals available on the 
respective channel, and a heap is a function from variables to values. We model 
a thread as a stack of statements, pushing whenever we encounter a branch 
and popping upon termination of branches. The semantic domains are defined 
formally in Fig. 1. 


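$$
\begin{aligned}
T \in TPool &\overset{def}{=} TId \overset{fin}{\rightharpoonup} seq\ Stm &
sf \in Schd &\overset{def}{=} \mathcal{S} \times TPool \times MPool \times Heap \to \mathcal{S} \times TId \\
M \in MPool &\overset{def}{=} Chan \to \mathbb{N} &
\Psi \in ReSchd &\overset{def}{=} Schd \times MPool \times Heap \to Schd \times \mathcal{S} \times TId \\
p \in Heap &\overset{def}{=} Var \to Val
\end{aligned}
$$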


Fig. 1. Semantic domains. 


The reduction relation is split into a local reduction relation that reduces
a given thread and a global reduction relation that picks the next thread to be
scheduled. The global reduction relation is defined in terms of the local reduction
relation, written $T, M, p \xrightarrow{t,a} T', M', p'$, which reduces the thread $t$ in thread
pool $T$, emitting action $a$ during the reduction. The global reduction relation
only distinguishes between reschedule actions and non-reschedule actions. To
reduce reschedule actions, the global reduction relation refers to a rescheduling
function $\Psi$, which computes the next scheduler and scheduler state. The global
reduction relation, written $sf, S, T, M, p \to_{\Psi} sf', S', T', M', p'$, is indexed by a
rescheduling function $\Psi$, which takes as arguments the current scheduling function,
message pool and heap and returns a new scheduling function, a new scheduler
state and a new thread identifier. The global reduction relation is defined formally in Fig. 2.


$$\frac{T, M, p \xrightarrow{t,a} T', M', p' \qquad sf(S, T, M, p) = (S', t) \qquad a \neq rs(-)}{sf, S, T, M, p \to_{\Psi} sf, S', T', M', p'}$$

$$\frac{T, M, p \xrightarrow{t,a} T', M', p' \qquad \Psi(sf, M, p) = (sf', S', t') \qquad a = rs(t')}{sf, S, T, M, p \to_{\Psi} sf', S', T', M', p'}$$


Fig. 2. Global reduction relation.


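$$\frac{T(t) = K[s] :: stk \qquad s, p \to_{a} s'}{T, M, p \xrightarrow{t,a} [\![a]\!]_A(T[t \mapsto K[s'] :: stk], M, p, t)} \qquad \frac{T(t) = \text{skip} :: stk}{T, M, p \xrightarrow{t,\epsilon} T[t \mapsto stk], M, p}$$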


Fig. 3. Local reduction relation. 


The local reduction relation is defined over configurations consisting of a
thread pool, a message pool and a heap (Fig. 3). It is defined in terms of a
statement reduction relation, $s, p \to_{a} s'$, that reduces a statement $s$ to $s'$ and
emits an action $a$ describing the behavior of the statement on the state. We
use evaluation contexts, $K$, to refer to the primitive statement that appears
in a reducible position inside a larger statement. We use $K[s]$ to denote the
substitution of statement $s$ in evaluation context $K$. Actions include a no-op
action, $\epsilon$, a branch action, $b(e, s)$, an assignment action, $a(x, v)$, a fork action,
$f(s)$, send and receive actions, $s(ch)$ and $r(ch)$, a wait action for blocking on a receive,
$w(ch)$, a reschedule action, $rs(t)$, and a wait action for blocking on a reschedule,
$wa$. Formally,


$$a \in Act ::= \epsilon \mid b(e, s) \mid a(x, v) \mid f(s) \mid s(ch) \mid r(ch) \mid w(ch) \mid wa \mid rs(t)$$


The behavior of an action $a$ on the state is given by the function $[\![a]\!]_A$ defined in
Fig. 4. The function $tgen$ is used to generate a fresh thread identifier for newly
forked threads. It thus satisfies the specification $tgen(T) \notin dom(T)$. We assume
$tgen$ is a fixed global function, but it is possible to generalize the semantics and
allow the rescheduling function to also pick a new thread identifier generator.
$active(T)$ denotes the set of active threads in $T$, i.e., $active(T) = \{t \in dom(T) \mid T(t) \neq \epsilon\}$.
The statement reduction relation is defined in Fig. 5.
$$
\begin{aligned}
[\![\epsilon]\!]_A(T, M, p, t) &= (T, M, p) \\
[\![b(e, s)]\!]_A(T, M, p, t) &= (T[t \mapsto s :: T(t)], M, p) \\
[\![a(x, v)]\!]_A(T, M, p, t) &= (T, M, p[x \mapsto v]) \\
[\![f(s)]\!]_A(T, M, p, t) &= (T[tgen(T) \mapsto s], M, p) \\
[\![s(ch)]\!]_A(T, M, p, t) &= (T, M[ch \mapsto M(ch) + 1], p) \\
[\![r(ch)]\!]_A(T, M, p, t) &= \text{if } M(ch) > 0 \text{ then } (T, M[ch \mapsto M(ch) - 1], p) \text{ else } \bot \\
[\![w(ch)]\!]_A(T, M, p, t) &= \text{if } M(ch) = 0 \text{ then } (T, M, p) \text{ else } \bot \\
[\![rs(t')]\!]_A(T, M, p, t) &= \text{if } |active(T)| = 1 \text{ then } ([t' \mapsto T(t)], M, p) \text{ else } \bot \\
[\![wa]\!]_A(T, M, p, t) &= \text{if } |active(T)| > 1 \text{ then } (T, M, p) \text{ else } \bot
\end{aligned}
$$


Fig. 4. Semantics of actions. 


Note that the semantics of actions is deterministic. For example, an $r(ch)$-transition
can only be executed if $M(ch) > 0$, while $w(ch)$ can only be emitted if $M(ch) = 0$
(the symbol $\bot$ in the definition means “undefined”). Note that reschedule only
reduces globally once all other threads in the thread pool have reduced fully and
that it further removes all other threads from the thread pool upon reducing
and assigns a new thread identifier to the only active thread. This requirement
ensures that once reschedule reduces and resets the scheduler state, other
threads that existed prior to the reduction of reschedule cannot immediately taint
the scheduler state again. The reschedule reduction step is deterministic: the
value of $t'$ is bound in the respective rule in Fig. 2 by the function $\Psi$.
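
The channel and reschedule actions of Fig. 4 can be sketched in a few lines of Python. This is only an illustrative encoding, not the paper's formalization: the dictionary-based representation of thread pools, message pools and heaps, the function names, and the use of `None` for "undefined" are all assumptions made for this sketch.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encodings (not from the paper): a thread pool maps thread ids
# to stacks of statements, a message pool maps channel names to signal
# counters, and a heap maps variable names to values.
TPool = Dict[int, List[str]]
MPool = Dict[str, int]
Heap = Dict[str, int]
State = Tuple[TPool, MPool, Heap]


def active(T: TPool) -> set:
    """Threads whose statement stack is non-empty."""
    return {t for t, stk in T.items() if stk}


def act_send(T: TPool, M: MPool, p: Heap, ch: str) -> Optional[State]:
    """s(ch): always defined; adds one signal to channel ch."""
    M2 = dict(M)
    M2[ch] = M2.get(ch, 0) + 1
    return (T, M2, p)


def act_recv(T: TPool, M: MPool, p: Heap, ch: str) -> Optional[State]:
    """r(ch): defined only if a signal is available on ch."""
    if M.get(ch, 0) > 0:
        M2 = dict(M)
        M2[ch] -= 1
        return (T, M2, p)
    return None  # None plays the role of "undefined"


def act_wait(T: TPool, M: MPool, p: Heap, ch: str) -> Optional[State]:
    """w(ch): defined only if no signal is available, so the thread blocks."""
    return (T, M, p) if M.get(ch, 0) == 0 else None


def act_resched(T: TPool, M: MPool, p: Heap, t: int, t_new: int) -> Optional[State]:
    """rs(t'): defined only with a single active thread; the pool shrinks to it."""
    if len(active(T)) == 1:
        return ({t_new: T[t]}, M, p)
    return None
```

For any message pool, exactly one of `act_recv` and `act_wait` is defined on a given channel, which mirrors the determinism remark above.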


Example 6. To illustrate the issue, consider the following code snippet. This 
program branches on a confidential (high) variable h and then spawns one of 
two threads with the sole purpose of tainting the scheduler with the state of 
h. It also contains a race on a public (low) variable l, which occurs after the
rescheduling. 


if h > 0 then fork(skip) else fork(skip; skip); 
reschedule; 
fork(l := 0); l := 1 


If reschedule could reduce and reset the scheduler state before the forked 
thread had reduced, then the forked thread could reduce between reschedule and 
the assignment and therefore affect which of the two racy assignments to l would 
win the race. Our operational semantics therefore only reduces reschedule once 
all other threads have terminated, which for the above example ensures that the 
forked thread has already fully reduced, and cannot taint the scheduler state 
after reschedule has reset it. 
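$$
\begin{aligned}
\text{while } e \text{ do } s,\ p &\to_{\epsilon} \text{if } e \text{ then } (s; \text{while } e \text{ do } s) \text{ else skip} \\
\text{if } e \text{ then } s_1 \text{ else } s_2,\ p &\to_{b(e, s_1)} \text{skip} \quad \text{if } [\![e]\!](p) = tt \\
\text{if } e \text{ then } s_1 \text{ else } s_2,\ p &\to_{b(e, s_2)} \text{skip} \quad \text{if } [\![e]\!](p) = ff \\
x := e,\ p &\to_{a(x, v)} \text{skip} \quad \text{where } v = [\![e]\!](p) \\
\text{skip}; s,\ p &\to_{\epsilon} s \\
\text{fork}(s),\ p &\to_{f(s)} \text{skip} \\
\text{send}(ch),\ p &\to_{s(ch)} \text{skip} \\
\text{recv}(ch),\ p &\to_{w(ch)} \text{recv}(ch) \\
\text{recv}(ch),\ p &\to_{r(ch)} \text{skip} \\
\text{reschedule},\ p &\to_{wa} \text{reschedule} \\
\text{reschedule},\ p &\to_{rs(t)} \text{skip}
\end{aligned}
$$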


Fig. 5. Statement reduction. 


2.1 Security Model 


In this section we introduce our formal security model for confidentiality. This is 
formalized as a non-interference property, requiring that attackers cannot learn 
anything about confidential inputs from observing public outputs. 

To express this formally, we assume a bounded $\sqcup$-semilattice $\mathcal{L}$ of security
levels for classifying the confidentiality levels of inputs and outputs. We say that
level $\ell_1$ flows into $\ell_2$ if $\ell_1 \sqsubseteq \ell_2$. In examples we typically assume $\mathcal{L}$ is a bounded
lattice with distinguished top and bottom elements, denoted $H$ and $L$, and
referred to as high and low, respectively. Given a security typing $\Gamma$ that assigns
security levels to all program variables and channel identifiers, we consider two
heaps $p_1$ and $p_2$ indistinguishable at attacker level $\ell_A$ if the two heaps agree on
all variables with a security level below or equal to the attacker security level:


$$p_1 \sim^{\Gamma}_{\ell_A} p_2 \;\overset{def}{=}\; \forall x \in Var.\ \Gamma(x) \sqsubseteq \ell_A \Rightarrow p_1(x) = p_2(x)$$


Likewise, we consider two message pools $M_1$ and $M_2$ indistinguishable at attacker
level $\ell_A$ if they agree on all channels with security level below or equal to the
attacker's security level:


$$M_1 \sim^{\Gamma}_{\ell_A} M_2 \;\overset{def}{=}\; \forall ch \in Chan.\ \Gamma(ch) \sqsubseteq \ell_A \Rightarrow M_1(ch) = M_2(ch)$$
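
The two indistinguishability relations have the same shape, so a single helper can illustrate both. The sketch below assumes a two-point lattice encoded as integers and dictionary-based heaps and message pools; the names `flows_to` and `indistinguishable` are hypothetical and chosen only for this illustration.

```python
# Two-point lattice L below H, encoded as integers (an assumption of this sketch).
L, H = 0, 1


def flows_to(l1: int, l2: int) -> bool:
    """l1 flows into l2 in the two-point lattice."""
    return l1 <= l2


def indistinguishable(gamma: dict, attacker: int, names, s1: dict, s2: dict) -> bool:
    """s1 and s2 agree on every name whose level flows to the attacker level.

    `names` is the set of variables (for heaps) or channels (for message pools).
    """
    return all(s1[x] == s2[x] for x in names if flows_to(gamma[x], attacker))
```

With `gamma = {"l": L, "h": H}`, two heaps that differ only in `h` are indistinguishable for an attacker at level `L` but distinguishable for an attacker at level `H`.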


Non-interference expresses that attackers cannot learn confidential informa- 
tion by requiring that executions from attacker indistinguishable initial mes- 
sage pools and heaps should produce attacker indistinguishable terminal mes- 
sage pools and heaps, when executed from the same initial scheduler state 
and scheduling function. Since scheduling and rescheduling functions have com- 
plete access to the machine state, including confidential variables and channels, 
we restrict attention to schedulers and reschedulers that only access attacker-observable
variables and channels. We say that a scheduler $sf$ is an $\ell$-scheduler
iff it does not distinguish message pools and heaps that are $\ell$-indistinguishable:


$$\ell\text{-level}(sf) \;\overset{def}{=}\; \forall S, T, M_1, M_2, p_1, p_2.\ M_1 \sim^{\Gamma}_{\ell} M_2 \wedge p_1 \sim^{\Gamma}_{\ell} p_2 \Rightarrow sf(S, T, M_1, p_1) = sf(S, T, M_2, p_2)$$
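
To make the $\ell$-level condition concrete, the following sketch checks it by brute force over a small finite sample of scheduler states, thread pools and heaps. Everything here is an assumption for illustration: the typing `GAMMA`, the two example schedulers `round_robin` and `peeking`, and the simplification that message pools are always empty.

```python
from itertools import product

# Two-point lattice and a hypothetical typing with one low and one high variable.
L, H = 0, 1
GAMMA = {"l": L, "h": H}


def low_equiv(level, p1, p2):
    """Heaps agree on all variables at or below the given level."""
    return all(p1[x] == p2[x] for x in GAMMA if GAMMA[x] <= level)


def is_level_scheduler(sf, level, states, pools, heaps):
    """Brute-force check over finite samples; message pools are kept empty."""
    for S, T, p1, p2 in product(states, pools, heaps, heaps):
        if low_equiv(level, p1, p2) and sf(S, T, {}, p1) != sf(S, T, {}, p2):
            return False
    return True


def round_robin(S, T, M, p):
    """Ignores the heap entirely, so it is a scheduler at every level."""
    tids = sorted(T)
    return (S + 1, tids[S % len(tids)])


def peeking(S, T, M, p):
    """Reads the high variable h, so it is not an L-scheduler."""
    tids = sorted(T)
    return (S + 1, tids[p["h"] % len(tids)])
```

On a sample with two threads and all heaps over values 0 and 1, `round_robin` passes the check at level `L`, while `peeking` passes only at level `H`.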
Likewise, a rescheduling function is an $\ell$-rescheduler iff it does not distinguish
message pools and heaps that are $\ell$-indistinguishable and only returns
$\ell$-schedulers:


$$\ell\text{-level}(\Psi) \;\overset{def}{=}\; \forall sf, M_1, M_2, p_1, p_2.\ \ell\text{-level}(\pi_1(\Psi(sf, M_1, p_1))) \wedge (M_1 \sim^{\Gamma}_{\ell} M_2 \wedge p_1 \sim^{\Gamma}_{\ell} p_2 \Rightarrow \Psi(sf, M_1, p_1) = \Psi(sf, M_2, p_2))$$

where $\pi_1$ is a projection to the first component of the triple.


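Definition 1 (Security). A thread pool $T$ satisfies non-interference at attacker
level $\ell_A$ and security typing $\Gamma$ iff all fully-reduced executions from $\ell_A$-related
initial heaps (starting with empty message pools) reduce to $\ell_A$-related terminal
heaps, for all $\ell_A$-level schedulers $sf$ and reschedulers $\Psi$:


$$
\begin{aligned}
&\forall p_1, p_2, p_1', p_2' \in Heap.\ \forall M_1', M_2' \in MPool.\ \forall S, S_1', S_2' \in \mathcal{S}.\ \forall T_1', T_2', sf, sf_1', sf_2'. \\
&\quad \ell_A\text{-level}(sf) \wedge \ell_A\text{-level}(\Psi) \wedge p_1 \sim^{\Gamma}_{\ell_A} p_2 \wedge final(T_1') \wedge final(T_2') \;\wedge \\
&\quad sf, S, T, \lambda ch.0, p_1 \to^{*}_{\Psi} sf_1', S_1', T_1', M_1', p_1' \;\wedge \\
&\quad sf, S, T, \lambda ch.0, p_2 \to^{*}_{\Psi} sf_2', S_2', T_2', M_2', p_2' \;\Rightarrow\; M_1' \sim^{\Gamma}_{\ell_A} M_2' \wedge p_1' \sim^{\Gamma}_{\ell_A} p_2'
\end{aligned}
$$


where $final(T) \overset{def}{=} \forall t \in dom(T).\ T(t) = \epsilon$.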


This non-interference property can be specialized in the obvious way from 
thread pools to programs by considering a thread pool with only the given 
program. 

In our security model, we focus on standard end-to-end security, i.e., the 
attacker is allowed to observe low parts of the initial and final heaps. The security 
definition quantifies over all possible schedulers, which in particular means that 
the attacker is allowed to choose any scheduler. 

To develop some intuition about our security model, let’s consider a few
basic examples. The program fork(x := 1); x := 2 is secure for any attacker
level $\ell_A$, because in any two executions from the same initial scheduler state
and $\ell_A$-equivalent initial message pools and heaps, the scheduler must schedule
the assignments in the same order. This follows from the assumption that the
scheduler cannot distinguish $\ell_A$-equivalent message pools and heaps.

If prior to a race on a low variable a thread branches on confidential informa- 
tion, then we can construct a scheduler that leaks this information. To illustrate, 
consider the following variant of Example 1 from the Introduction: 


fork(if h then skip else (skip; skip)); // Thread 1
fork(l := 1); // Thread 2 
fork(l := 2) // Thread 3 


If we take the scheduler state to be a natural number corresponding to the 
number of statements reduced so far, then we can construct a scheduler that first 
reduces Thread 1 and then schedules Thread 2 if Thread 1 was fully reduced in 
two steps and Thread 3 if Thread 1 was fully reduced in three steps. Therefore, 
this program is not secure. 
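
The leaking scheduler described above can be simulated directly. The sketch below is a simplification under stated assumptions: threads are modelled as lists of atomic steps rather than statement stacks, Thread 1 is given two steps when h is true and three otherwise (following the step counts in the text), and the names `scheduler` and `run` are hypothetical. The scheduler reads only its own step counter and the set of active threads, never the heap.

```python
def scheduler(S, T):
    """Return (new scheduler state, thread id); reads only S and the thread pool."""
    active = [t for t in sorted(T) if T[t]]
    if 1 in active:
        choice = 1                       # run Thread 1 to completion first
    elif active == [2, 3]:
        choice = 2 if S == 2 else 3      # step count reveals Thread 1's branch
    else:
        choice = active[0]               # only one thread left
    return S + 1, choice


def run(h: bool) -> int:
    """Execute the three-thread program and return the final value of l."""
    heap = {"h": h, "l": 0}
    T = {
        1: ["skip"] * (2 if h else 3),   # abstract step count of Thread 1
        2: [("l", 1)],                   # l := 1
        3: [("l", 2)],                   # l := 2
    }
    S = 0
    while any(T.values()):
        S, t = scheduler(S, T)
        step = T[t].pop(0)
        if step != "skip":
            x, v = step
            heap[x] = v
    return heap["l"]
```

Under this scheduler the final value of l is 2 when h is true and 1 when h is false, so a low observer learns h even though the scheduler never inspects the heap.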
3 Type System


In this section we present a type-and-effect system for establishing non- 
interference. The type-and-effect system is inspired by separation logic [36] and 
uses ideas of ownership and resources to track whether accesses to variables and 
channels may be racy and to bound the security level of the data that may be 
learned through observing how threads are scheduled. Statements are typed rel- 
ative to a pre- and postcondition, where the precondition describes the resources 
necessary to run the statement and the postcondition the resources owned after 
executing the statement. The statement typing judgment has the following form: 


$$\Gamma \mid \Delta \mid pc \vdash \{P\}\ s\ \{Q\}$$


Here $P$ and $Q$ are resources and $pc$ is an upper bound on the security level
of the data that can be learned through knowing the control of the program
up to this point. Context $\Gamma$ defines security levels for all program variables and
channel identifiers and $\Delta$ defines a static resource specification for every channel
identifier. We will return to these contexts later. Expressions are typed relative
to a precondition and the expression typing judgment has the following form:
$\Gamma \vdash \{P\}\ e : \ell$. Here $\ell$ is an upper bound on the security level of the data
computed by $e$. Resources are described by the following grammar:


$$P, Q ::= emp \mid P * Q \mid x_{\pi} \mid ch_{\pi} \mid schd_{\pi}(\ell) \mid [P]^{\ell}$$


where $\pi \in \mathbb{Q} \cap (0, 1]$. The emp assertion describes the empty resource that does
not assert ownership of anything. The P * Q assertion describes a resource that 
can be split into two disjoint resources, P and Q, respectively. This assertion is 
inspired by separation logic and is used to reason about separation of resources. 

Variable resources, written $x_{\pi}$, express fractional ownership of variable $x$ with
fraction $\pi \in \mathbb{Q} \cap (0, 1]$. We use these to reason locally about whether accesses to a
given variable might cause a race. Ownership of the full fraction $\pi = 1$ expresses
that we own the variable exclusively and can therefore access the variable without
fear of causing a race. Any fraction less than 1 only expresses partial ownership
and accessing the given variable could therefore cause a race. These variable
resources can be split and recombined using the fraction. We express this using
the resource entailment judgment, written $\Gamma \vdash P \Rightarrow Q$, which asserts that
resource $P$ can be converted into resource $Q$. We write $\Gamma \vdash P \Leftrightarrow Q$ when
resource $P$ can be converted into $Q$ and $Q$ can be converted into $P$. Splitting and
recombination of variable resources comply with the rule: if $\pi_1 + \pi_2 \leq 1$ then
$\Gamma \vdash x_{\pi_1 + \pi_2} \Leftrightarrow x_{\pi_1} * x_{\pi_2}$. This can for instance be used to split an exclusive permission
into two partial permissions that can be passed to two different threads and later
recombined back into the exclusive permission.
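
Splitting and recombining fractional permissions can be illustrated with exact rational arithmetic. The class below is a hypothetical sketch (the name `VarPerm` and its methods are not from the paper); it only models the arithmetic side condition that fractions stay within (0, 1].

```python
from fractions import Fraction


class VarPerm:
    """Fractional ownership of one variable, with a fraction in (0, 1]."""

    def __init__(self, var: str, frac):
        frac = Fraction(frac)
        if not (0 < frac <= 1):
            raise ValueError("fraction must lie in (0, 1]")
        self.var, self.frac = var, frac

    @property
    def exclusive(self) -> bool:
        """Full ownership: accesses cannot race with other threads."""
        return self.frac == 1

    def split(self, frac):
        """Split off `frac`, leaving the remainder as a second permission."""
        frac = Fraction(frac)
        if not (0 < frac < self.frac):
            raise ValueError("cannot split off that much")
        return VarPerm(self.var, frac), VarPerm(self.var, self.frac - frac)

    def join(self, other: "VarPerm") -> "VarPerm":
        """Recombine two permissions; fails if the sum would exceed 1."""
        if other.var != self.var:
            raise ValueError("permissions are for different variables")
        return VarPerm(self.var, self.frac + other.frac)
```

Splitting an exclusive permission for x into 1/3 and 2/3 yields two partial permissions, neither of which is exclusive; joining them restores the exclusive permission.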

The other kind of crucial resources, $schd_{\pi}(\ell)$, where $\pi \in \mathbb{Q} \cap (0, 1]$, allows us
to track the scheduler level (also called the scheduler taint). A labeled scheduler
resource, $schd_{\pi}(\ell)$, expresses that the scheduler level currently cannot go above
$\ell$. This is both a guarantee we give to the environment and something we can rely
on the environment to follow. This guarantee ensures that the level of information
that can be learned by observing how threads are scheduled is bounded by
the scheduler level. Again, we use fractional permissions to split the scheduler
resource between multiple threads: if $\pi_1 + \pi_2 \leq 1$ then
$\Gamma \vdash schd_{\pi_1 + \pi_2}(\ell) \Leftrightarrow schd_{\pi_1}(\ell) * schd_{\pi_2}(\ell)$. If we own the scheduler resource exclusively, then no one
else is relying on the scheduler level staying below a given security level and we
can thus change the scheduler rely-guarantee level to a higher security level: if
$\ell_1 \sqsubseteq \ell_2$ then $\Gamma \vdash schd_1(\ell_1) \Rightarrow schd_1(\ell_2)$. In general it is not secure to lower the
upper bound on the scheduler level in this way, even if we own the scheduler
resource exclusively. Instead, we must use reschedule to lower the scheduler level.
We will return to this issue in a subsequent section.


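$$\text{T-SKIP}\quad \frac{}{\Gamma \mid \Delta \mid pc \vdash \{P\}\ \text{skip}\ \{P\}}$$

$$\text{T-SEQ}\quad \frac{\Gamma \mid \Delta \mid pc \vdash \{P\}\ s_1\ \{R\} \qquad \Gamma \mid \Delta \mid pc \vdash \{R\}\ s_2\ \{Q\}}{\Gamma \mid \Delta \mid pc \vdash \{P\}\ s_1; s_2\ \{Q\}}$$

$$\text{T-IF}\quad \frac{\Gamma \vdash \{P\}\ e : \ell \qquad P = R * schd_{\pi}(\ell_s) \qquad \ell \sqsubseteq \ell_s \qquad \Gamma \mid \Delta \mid pc \sqcup \ell \vdash \{P\}\ s_i\ \{Q\} \text{ for } i \in \{1, 2\}}{\Gamma \mid \Delta \mid pc \vdash \{P\}\ \text{if } e \text{ then } s_1 \text{ else } s_2\ \{Q\}}$$

$$\text{T-WHILE}\quad \frac{\Gamma \vdash \{P\}\ e : \ell \qquad P = R * schd_{\pi}(\ell_s) \qquad \ell \sqsubseteq \ell_s \qquad \Gamma \mid \Delta \mid pc \sqcup \ell \vdash \{P\}\ s\ \{P\}}{\Gamma \mid \Delta \mid pc \vdash \{P\}\ \text{while } e \text{ do } s\ \{P\}}$$

$$\text{T-ASGN-EXCL}\quad \frac{\Gamma \vdash \{P\}\ e : \ell \qquad \Gamma \vdash P \Rightarrow x_1 \qquad pc \sqcup \ell \sqsubseteq \Gamma(x)}{\Gamma \mid \Delta \mid pc \vdash \{P\}\ x := e\ \{P\}}$$

$$\text{T-ASGN-RACY}\quad \frac{\Gamma \vdash \{P\}\ e : \ell \qquad P = R * schd_{\pi_s}(\ell_s) \qquad \Gamma \vdash P \Rightarrow x_{\pi} \qquad pc \sqcup \ell \sqcup \ell_s \sqsubseteq \Gamma(x)}{\Gamma \mid \Delta \mid pc \vdash \{P\}\ x := e\ \{P\}}$$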


Fig. 6. Typing rules for assignments and control flow statements. 


State and Control Flow. Before introducing the remaining resources, let’s look at
the typing rules for assignments and control flow primitives, to illustrate how we
use these variable and scheduler resources. The type-and-effect system features
two assignment rules, one for non-racy assignments and one for potentially racy
assignments (T-ASGN-EXCL and T-ASGN-RACY, respectively, in Fig. 6). If we
own a variable resource exclusively, then we can use the typing rule for non-racy
assignments and we do not have to worry about leaking information through
scheduling. However, if we only own a partial variable resource for a given variable,
then any access to the variable could potentially introduce a race and we
have to ensure information learned from scheduling is allowed to flow into the
given variable. The typing rule for potentially racy assignments (T-ASGN-RACY)
thus requires that we own a scheduler resource, $schd_{\pi}(\ell_s)$, that bounds the information
that can be learned through scheduling, and requires that $\ell_s$ may flow
into $\Gamma(x)$. Both assignment rules naturally also require that the security levels
of the assigned expression and of the current pc are allowed to flow into the
assigned variable. The assigned expression is typed using the expression typing
judgment, $\Gamma \vdash \{P\}\ e : \ell$, using the rules from Fig. 7. This judgment computes
an upper bound $\ell$ on the security level of the data computed by the expression
and ensures that $P$ asserts at least partial ownership of any variables accessed
by $e$. Hence, exclusive ownership of a given variable $x$ ensures the absence not only
of write-write races on the given variable, but also of read-write races, which can
also be exploited to leak confidential information through scheduling.
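
The side conditions of the two assignment rules can be summarized as a small decision procedure. This is a sketch under simplifying assumptions, not the type checker from the paper: levels come from a two-point lattice encoded as integers, the owned fraction of the assigned variable and the owned scheduler level are passed in directly instead of being derived from a resource P, and the function name `check_assign` is hypothetical.

```python
from fractions import Fraction

# Two-point lattice L below H; join is max in this encoding (an assumption).
L, H = 0, 1


def join(*levels: int) -> int:
    return max(levels)


def check_assign(gamma, pc, expr_level, x, frac, sched_level=None):
    """Return the name of an applicable assignment rule, or None.

    frac        -- owned fraction of the variable resource for x
    sched_level -- level of an owned scheduler resource, if any
    """
    # Exclusive ownership: only pc and the expression level must flow into x.
    if frac == 1 and join(pc, expr_level) <= gamma[x]:
        return "T-ASGN-EXCL"
    # Partial ownership: the scheduler level must flow into x as well.
    if frac > 0 and sched_level is not None \
            and join(pc, expr_level, sched_level) <= gamma[x]:
        return "T-ASGN-RACY"
    return None
```

For a low variable l, a racy assignment is accepted only with a low scheduler resource, whereas an exclusively owned l can be assigned regardless of the scheduler level, as long as pc and the expression are low.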


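$$\text{T-SUB}\quad \frac{\Gamma \vdash \{P\}\ e : \ell_1 \qquad \ell_1 \sqsubseteq \ell_2}{\Gamma \vdash \{P\}\ e : \ell_2} \qquad \text{T-CONST}\quad \frac{}{\Gamma \vdash \{P\}\ v : \bot} \qquad \text{T-VAR}\quad \frac{\Gamma \vdash P \Rightarrow x_{\pi}}{\Gamma \vdash \{P\}\ x : \Gamma(x)}$$

$$\text{T-ADD}\quad \frac{\Gamma \vdash \{P\}\ e_1 : \ell \qquad \Gamma \vdash \{P\}\ e_2 : \ell}{\Gamma \vdash \{P\}\ e_1 + e_2 : \ell} \qquad \text{T-EQ}\quad \frac{\Gamma \vdash \{P\}\ e_1 : \ell \qquad \Gamma \vdash \{P\}\ e_2 : \ell}{\Gamma \vdash \{P\}\ e_1 = e_2 : \ell}$$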


Fig. 7. Typing rules for expressions. 


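$$\text{T-FRAME}\quad \frac{\Gamma \mid \Delta \mid pc \vdash \{P\}\ s\ \{Q\}}{\Gamma \mid \Delta \mid pc \vdash \{P * R\}\ s\ \{Q * R\}}$$

$$\text{T-CONSEQ}\quad \frac{\Gamma \vdash P_1 \Rightarrow P_2 \qquad \Gamma \mid \Delta \mid pc_2 \vdash \{P_2\}\ s\ \{Q_2\} \qquad \Gamma \vdash Q_2 \Rightarrow Q_1 \qquad pc_1 \sqsubseteq pc_2}{\Gamma \mid \Delta \mid pc_1 \vdash \{P_1\}\ s\ \{Q_1\}}$$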


Fig. 8. Structural typing rules. 


The typing rules for conditionals and loops (T-IF and T-WHILE) both require
ownership of a scheduler resource with a scheduler level $\ell_s$, and this scheduler level
must be an upper bound on the security level of the branching expression. The
structural rule of consequence (T-CONSEQ in Fig. 8) allows us to strengthen preconditions
and weaken postconditions. In particular, in conjunction with the resource
implication rules in Fig. 9, it allows us to raise the level of the scheduler resource, which
is necessary to type branching on high-security data.


Spawning Threads. When spawning a new thread, the spawning thread is able 
to transfer some of its resources to the newly created thread. This is captured 
by the T-FORK rule given below, which transfers the resources described by P 
from the spawning thread to the spawned thread. 


$$\text{T-FORK}\quad \frac{\Gamma \mid \Delta \mid pc \vdash \{P\}\ s\ \{Q\}}{\Gamma \mid \Delta \mid pc \vdash \{P\}\ \text{fork}(s)\ \{emp\}}$$

Naturally, the newly spawned thread inherits the pc-level of the spawning thread.
Upon termination of the spawned thread, the resources still owned by the
spawned thread are lost. Transferring resources back to the spawning thread
or to other threads requires synchronization using channels.
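$$\Gamma \vdash P \Rightarrow P * emp \qquad \Gamma \vdash P * Q \Rightarrow Q * P \qquad \Gamma \vdash (P * Q) * R \Leftrightarrow P * (Q * R)$$

$$\Gamma \vdash P * Q \Rightarrow P \qquad \frac{\Gamma \vdash P \Rightarrow Q \qquad \Gamma \vdash Q \Rightarrow R}{\Gamma \vdash P \Rightarrow R} \qquad \frac{\Gamma \vdash P \Rightarrow Q}{\Gamma \vdash P * R \Rightarrow Q * R}$$

$$\frac{\pi_1 + \pi_2 \leq 1}{\Gamma \vdash x_{\pi_1} * x_{\pi_2} \Leftrightarrow x_{\pi_1 + \pi_2}} \qquad \frac{\pi_1 + \pi_2 \leq 1}{\Gamma \vdash schd_{\pi_1}(\ell) * schd_{\pi_2}(\ell) \Leftrightarrow schd_{\pi_1 + \pi_2}(\ell)}$$

$$\frac{\ell_1 \sqsubseteq \ell_2}{\Gamma \vdash schd_1(\ell_1) \Rightarrow schd_1(\ell_2)}$$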


Fig. 9. Resource implication rules. 


Synchronization. From the point of view of resources, synchronization is about 
transferring ownership of resources between threads. When sending a message 
on a channel, we relinquish ownership of some of our resources, which become 
associated with the message until it is read. Conversely, when reading from a 
channel the reader may take ownership of a part of the resource associated with 
the message it reads. The $\Delta$ context defines a static specification for every channel
identifier that describes the resources we wish to associate with messages on
the given channel. If $\Delta(ch) = P$, then we must transfer resource $P$ when sending
a message on channel $ch$. However, when receiving a message from channel $ch$,
we might only be able to acquire part of $P$, depending on whether our receive
may race with other receives to acquire the resources and how our pc-level relates
to the pc-level of the sender of the message and to the potential scheduler taint.

To capture this formally, our type-and-effect system contains channel
resources, written $ch_{\pi}$, erased resources, written $[P]^{\ell}$, and channel security levels,
$\Gamma(ch)$. Like variable resources, channel resources allow us to track whether
a given receive operation on a channel might race with another receive on the
same channel using a fraction $\pi$. To receive on a channel $ch$ requires fractional
ownership of the corresponding channel resource. The channel resource can be
split and recombined freely: $\Gamma \vdash ch_{\pi_1 + \pi_2} \Leftrightarrow ch_{\pi_1} * ch_{\pi_2}$, with the full fraction,
$\pi = 1$, indicating an exclusive right to receive on the given channel. The erased
resource, $[P]^{\ell}$, is used to erase variable and channel resources in $P$ with security
levels that are not greater than or equal to the security level $\ell$. To illustrate
how we use these features to type send and receive commands, let us start by
considering an example that is not secure, and that should therefore not be
typeable.

We start with the simpler case of non-racy receives. In the case of non-racy 
receives, we have to prevent ownership transfer of low variables from a high 
security context to a lower security context. This is illustrated by the program 


fork(if h then send(a) else send(b)); 
fork(recv(a); l := 1; send(b)); 
fork(recv(b); l := 2; send(a)) 

This code snippet spawns a thread which sends a message on either channel 
a or b depending on the value of the confidential variable h. Then the program 
spawns two other threads that wait until there is an available message on their 
channel, before they write to l and message the other thread that it may
proceed. This code snippet is insecure, because if h is initially true, then the public 
variable l will contain the value 2 upon termination and if h is initially false, 
then l will contain the value 1. 
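
The leak can be observed concretely with a small simulation. The sketch below
is an illustration only: it models channels a and b as Python queues and the
three forked threads as operating-system threads, which is an assumption about
the execution model rather than the paper's formal semantics. Because the
channels force the order of the two writes, the outcome does not depend on the
scheduler.

```python
import queue
import threading

def run(h):
    """Run the three-thread program once and return the final value of l."""
    a, b = queue.Queue(), queue.Queue()   # channels a and b
    state = {"l": 0}

    def sender():                 # fork(if h then send(a) else send(b))
        (a if h else b).put(None)

    def second():                 # fork(recv(a); l := 1; send(b))
        a.get()
        state["l"] = 1
        b.put(None)

    def third():                  # fork(recv(b); l := 2; send(a))
        b.get()
        state["l"] = 2
        a.put(None)

    threads = [threading.Thread(target=f) for f in (sender, second, third)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["l"]

print(run(True), run(False))      # 2 1
```

An observer of l thus learns h exactly, regardless of scheduling.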


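\[
\frac{pc \sqsubseteq \Gamma(ch)}
     {\Gamma \mid \Delta \mid pc \vdash \{\Delta(ch)\}\; \mathtt{send}(ch)\; \{\mathsf{emp}\}}
\quad \text{(T-SEND)}
\]
\[
\frac{P = R * schd_\pi(\ell_s) \qquad pc \sqsubseteq \Gamma(ch) \sqsubseteq \ell_s \qquad \Gamma \vdash P \Rightarrow ch_1}
     {\Gamma \mid \Delta \mid pc \vdash \{P\}\; \mathtt{recv}(ch)\; \{P * [\Delta(ch)]^{\Gamma(ch)}\}}
\quad \text{(T-RECV-EXCL)}
\]
\[
\frac{P = R * schd_\pi(\ell_s) \qquad pc \sqsubseteq \Gamma(ch) = \ell_s \qquad \Gamma \vdash P \Rightarrow ch_\pi}
     {\Gamma \mid \Delta \mid pc \vdash \{P\}\; \mathtt{recv}(ch)\; \{P * [\Delta(ch)]^{\Gamma(ch)}\}}
\quad \text{(T-RECV-RACY)}
\]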


Fig. 10. Typing rules for synchronization primitives. 


To type this program, the idea would be to transfer exclusive ownership of 
the public variable l along channels a and b. However, our type system prevents 
this by erasing the resources received along channels a and b at the high security 
level, because the first thread may send messages on a and b in a high security 
context (i.e., with a high pc-level). 

Formally, the typing rules for send and for exclusive receives are given by T- 
SEND and T-RECV-EXCL in Fig. 10. The send rule requires that the security level 
of the channel is greater than or equal to the sender’s pc-level and the exclusive 
receive rule erases the resources received from the channel using the security-level 
of the channel. This means that the second and third threads do not get exclusive 
ownership of the l variable and that we therefore cannot type the subsequent 
assignments. The exclusive receive rule also requires fractional ownership of the 
scheduler resource and that the bound on the taint on the scheduler level is 
greater than or equal to the channel security level when receiving on a channel. 
This condition is related to the use of reschedule and we will return to this 
condition later. 


Example 7. To illustrate how to use these rules for ownership transfer, consider 
the following variant of the examples from the introduction. 


ex7 = fork(if h then s1 else s2); /* high computation */ 
fork(l := 1; send(c)); 
recv(c); l := 2 


It forks off a thread that does a high computation and potentially taints 
the scheduler with confidential information. The main thread also forks off a 
new thread that performs a write to public variable l, before itself writing to l. 
However, a communication through channel c in between these two assignments 
ensures that they are not racy and therefore do not leak private information for 
any chosen scheduling. We can, for instance, type this example as follows: 


\[
\Gamma \mid \Delta \mid L \vdash \{c_1 * l_1 * h_1 * schd_1(L)\}\; \mathit{ex}_7\; \{c_1 * l_1 * schd_1(H)\}
\]

where $\Gamma$ and $\Delta$ are defined as follows: $\Gamma(l) = \Gamma(c) = L$,
$\Gamma(h) = H$, and $\Delta(c) = l_1$. 

This typing requires the main thread to pass exclusive ownership of l to the 
second thread upon forking, which is then passed back on channel c. Since we only 
send and receive on channel c in a low context, we can take the channel security 
level to be low for c. When the main thread receives a message on c it thus takes 
ownership of $[l_1]^{\Gamma(c)}$, and since $\Gamma(c) = L \sqsubseteq \Gamma(l)$, it
follows that $\Gamma \vdash [l_1]^{L} \Rightarrow l_1$. The 
main thread thus owns the variable resource for l exclusively when typing the 
second assignment. 


We use the resource implication rules in Fig. 11 to reason about erased 
resources, by pulling resources out of the erasure. For instance, if the security 
level of a variable x is greater than or equal to the erasure security level, then 
we can pull it out of the erasure: if $\ell \sqsubseteq \Gamma(x)$ then
$\Gamma \vdash [x_\pi]^{\ell} \Rightarrow x_\pi$; and likewise for channel resources:
if $\ell \sqsubseteq \Gamma(ch)$ then $\Gamma \vdash [ch_\pi]^{\ell} \Rightarrow ch_\pi$.
Resources that cannot be pulled out of the erasure cannot be used for anything;
owning $[x_\pi]^{\ell}$ where $\Gamma(x) \not\sqsupseteq \ell$ is thus equivalent to
owning emp. The full set of erasure implication rules is given in Fig. 11. Notice
that scheduler resources never get erased:
$\Gamma \vdash [schd_\pi(\ell_s)]^{\ell} \Rightarrow schd_\pi(\ell_s)$. Moreover, the
resource erasure is idempotent and distributes over the star operator. 


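\[
\frac{\ell \sqsubseteq \Gamma(x)}
     {\Gamma \vdash [x_\pi]^{\ell} \Rightarrow x_\pi}
\qquad
\frac{\ell \sqsubseteq \Gamma(ch)}
     {\Gamma \vdash [ch_\pi]^{\ell} \Rightarrow ch_\pi}
\qquad
\Gamma \vdash [schd_\pi(\ell_s)]^{\ell} \Rightarrow schd_\pi(\ell_s)
\]
\[
\Gamma \vdash [[P]^{\ell}]^{\ell} \Leftrightarrow [P]^{\ell}
\qquad
\Gamma \vdash [P_1 * P_2]^{\ell} \Rightarrow [P_1]^{\ell} * [P_2]^{\ell}
\]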


Fig. 11. Erasure implication rules 
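
As a rough illustration of how erasure behaves, the following sketch computes
which atomic resources can be pulled out of an erasure at a given level. It
assumes a two-point lattice with L below H and represents resources as
hypothetical (kind, name) pairs with fractions omitted; none of these names
come from the paper.

```python
ORDER = {"L": 0, "H": 1}          # assumed two-point lattice, L below H

def leq(l1, l2):
    """Lattice order on security levels."""
    return ORDER[l1] <= ORDER[l2]

def erase(resources, level, gamma):
    """Return the atomic resources that survive erasure at `level`."""
    kept = []
    for kind, name in resources:
        # Scheduler resources are never erased; variables and channels
        # survive only if their level is at least the erasure level.
        if kind == "schd" or leq(level, gamma[name]):
            kept.append((kind, name))
    return kept

gamma = {"l": "L", "h": "H", "c": "L"}
P = [("var", "l"), ("var", "h"), ("chan", "c"), ("schd", "-")]

print(erase(P, "H", gamma))       # [('var', 'h'), ('schd', '-')]
print(erase(P, "L", gamma) == P)  # True
```

Erasing at the low level keeps everything, while erasing at the high level
drops the low variable and the low channel. This matches the use of erasure in
the receive rules, where a message received on a high channel cannot carry
ownership of a low variable.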


Racy Synchronization. In the case of racy receives, where we have multiple 
threads racing to take ownership of a message on the same channel, we have to 
restrict which resources the receivers can take ownership of even further. This is 
best illustrated with another example of an insecure program. The following is 
a variant of the earlier insecure program, but instead of sending a message on a 
channel in a high context it sends a message on a channel in a low context after 
the scheduler has been tainted and the scheduler level has been raised to high. 

if h then skip else (skip; skip); 
send(c); 

fork(recv(c); l := 1; send(c)); 
recv(c); l := 2; send(c) 


With a suitably chosen scheduler, the initial value of the confidential variable 
h could decide which of the two racy receives will receive the initial message on 
c and thereby leak the initial value of h through the public variable l. We thus 
have to ensure that this program is not typeable. Our type system ensures that 
this is the case by requiring the scheduler level to equal the channel security level 
when performing a potentially racy receive. In the case of the example above, 
the scheduler level becomes high after the high branching and is still high when we 
type check the two receives; since they are racy we are forced to set the security 
level of channel c to high (see the typing rule T-RECV-RACY for racy receives 
in Fig. 10), which ensures we cannot transfer ownership of the public variable l 
on c. This in turn ensures that we cannot type the assignments to l as exclusive 
assignments and therefore that the example is not typeable. 
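
To make the phrase "a suitably chosen scheduler" concrete, here is a minimal
simulation sketch. It assumes a hypothetical scheduler that picks the next
thread from a global step counter and counts a blocked receive as a wasted
slot; this scheduler is an illustration and is not taken from the paper.

```python
def run(h):
    """Return the final value of l under a step-counting scheduler."""
    branch = ["skip"] if h else ["skip", "skip"]
    main = branch + ["send", "fork", "recv", "l=2", "send"]
    child = ["recv", "l=1", "send"]
    threads = [main]          # each thread is the list of its remaining operations
    l, pending, t = 0, 0, 0   # pending = unread messages on channel c
    while threads:
        th = threads[t % len(threads)]   # choice depends on the step count
        t += 1
        op = th[0]
        if op == "recv" and pending == 0:
            continue                     # receive blocks; the slot is wasted
        th.pop(0)
        if op == "recv":
            pending -= 1
        elif op == "send":
            pending += 1
        elif op == "fork":
            threads.append(list(child))
        elif op.startswith("l="):
            l = int(op[2:])
        threads = [x for x in threads if x]   # drop finished threads
    return l

print(run(True), run(False))   # 2 1
```

The extra skip in the else branch shifts the parity of the step counter, which
decides whether the main thread or the forked thread wins the race for the
first message, and hence which assignment to l happens last.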


Reschedule. Recall that if we own the scheduler resource exclusively, then we 
can freely raise the upper bound on the security level of the scheduler, since 
no other threads are relying on any upper bound. In general, it is not sound 
to lower this upper bound, unless we can guarantee that the current scheduler 
level is less than or equal to the new upper bound. This is exactly what the 
reschedule statement ensures. The typing rule for reschedule (T-RESCHED given 
below) thus requires exclusive ownership of the scheduler resource and allows us 
to change this upper bound to any security level we wish. To ensure soundness, 
we only allow reschedule to be used when the pc-level is $\bot$, the bottom security 
level of the semilattice of security elements. 


\[
\Gamma \mid \Delta \mid \bot \vdash \{schd_1(\ell_1)\}\; \mathtt{reschedule}\; \{schd_1(\ell_2)\}
\quad \text{(T-RESCHED)}
\]


Example 8. To illustrate how the typing rule for reschedule is used, consider the 
following code snippet from the introduction section: 


ex8 = if h then skip else (skip; skip); 
reschedule; 
fork(l := 0); l := 1 


Recall that this snippet is secure, since reschedule resets the scheduler state 
before the race on l. We can, for instance, type this example as follows: 


\[
\Gamma \mid \Delta \mid L \vdash \{l_1 * h_1 * schd_1(L)\}\; \mathit{ex}_8\; \{l_1 * schd_1(L)\}
\]

with $\Gamma(l) = L$ and $\Gamma(h) = H$. 

To type this example we first raise the upper bound on the scheduler level 
from low to high, so that we can branch on confidential h. Then we use
T-RESCHED to reset it back to low after reschedule. At this point we split both the 
scheduler and variable resource for variable l into two, keep one part of each for 
the main thread and give away one part of each to the newly spawned thread. 
The two assignments to l are now typed by the T-ASGN-RACY rule. 
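
The effect of reschedule in Example 8 can be sketched with a toy simulation.
The sketch assumes a hypothetical scheduler whose entire state is a global step
counter, and it models reschedule as resetting that counter; both are
simplifying assumptions for illustration, not the paper's operational semantics.

```python
def run(h, use_reschedule):
    """Return the final value of l for Example 8, with or without reschedule."""
    branch = ["skip"] if h else ["skip", "skip"]
    middle = ["reschedule"] if use_reschedule else []
    main = branch + middle + ["fork", "l=1"]
    child = ["l=0"]
    threads, l, t = [main], None, 0
    while threads:
        th = threads[t % len(threads)]   # scheduler state is the step counter t
        t += 1
        op = th.pop(0)
        if op == "reschedule":
            t = 0                        # reset the scheduler state
        elif op == "fork":
            threads.append(list(child))
        elif op.startswith("l="):
            l = int(op[2:])
        threads = [x for x in threads if x]   # drop finished threads
    return l

print(run(True, False), run(False, False))   # 0 1
print(run(True, True), run(False, True))     # 1 1
```

Without reschedule the winner of the race on l depends on h. With reschedule
the race is still resolved by the scheduler, but identically for both values
of h.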


Example 9. To illustrate why we only allow reschedule to be used at pc-level 
$\bot$, consider the following example, which branches on the confidential variable 
h before executing reschedule in both branches. 


fork(if h then (reschedule; skip) else (reschedule; skip; skip)); 
fork(l := 0); l := 1 


Despite doing a reschedule in both branches, the subsequent statements in the 
two branches immediately taint the scheduler with information about h again, 
after the scheduler has been reset. This example is thus not safe. 

In the full version of the paper, the reader will find several more intricate 
examples justifying the constraints of the rules. 


Precision of the Type System. Notice that mere racy reading or writing from/to 
variables does not taint the scheduler. For example, programs 


fork(l := 1); fork(m := l); fork(h := 0); h := 1 
fork(l := 0); h := h + 1; l := 1 
if l then h := 0 else h := 1; (fork(l := 0); l := 1) 


where l, m are low variables and h is a high variable, are all secure in the sense 
of Definition 1 and are typeable. Indeed, there is no way to exploit scheduling 
to leak the secret value h in any of these programs. The scheduler may get 
tainted only if a high branch or a receive from a high channel is encountered, 
since the number of computation steps for the remaining computation (and hence 
its scheduling) may then depend on a secret value as, for example, in the program 
while h do h := h - 1; (fork(l := 0); l := 1). This example is rejected by our type 
system. To re-enable low races in the last example, rescheduling must be used: 


while h do h := h - 1; reschedule; (fork(l := 0); l := 1) 


The last example is secure and accepted by the type system. 

Limitations of our type system include imprecisions such as when both 
branches of a secret-dependent if-statement take the same number of steps, e.g., 
if h then skip else skip; (fork(l := 0); l := 1), and standard imprecisions of
flow-insensitive type-based approaches to information flow that reject programs such 
as if h then l := 0 else l := 0 or (if h then l := 0 else l := 1); l := 42. 

Language Extensions. We believe that the ideas of this section can be extended 
to richer languages using standard techniques [17,32,51]. In particular, to han- 
dle a language with procedures we would use a separate environment to record 
types for procedures, similarly to what is done in, e.g., [34]. (In loc. cit. they did 
not cover concurrency; however, we take inspiration from [12] which presents a 
concurrent separation logic for a language with procedures and mutable stack 
variables.) Specifications for procedures would involve quantification over vari- 
ables and security levels. 


4 Soundness 


Let T be a thread pool and let P, Q map every thread identifier $t \in dom(T)$ to 
a resource. We write $\Gamma \mid \Delta \vdash \{P\}\; T\; \{Q\}$ if P(t) and Q(t)
are typing resources for every thread T(t) with respect to $\Gamma$ and $\Delta$.
We say that resource R is compatible if the implication
$\Gamma \vdash \mathop{\circledast}_{x \in Var} x_1 * \mathop{\circledast}_{ch \in Chan} ch_1 * schd_1(L) \Rightarrow R$
is provable. 


Theorem 1 (Soundness). If $\Gamma \mid \Delta \vdash \{P\}\; T\; \{Q\}$ and the
composition of all the resources in P is compatible, then T satisfies
non-interference for all attacker levels $\ell_A$. 


Notice that the theorem quantifies universally over all attacker levels $\ell_A$; hence, 
one typing is sufficient to guarantee security against all possible adversaries. 

As a direct corollary of the theorem, we obtain a compositionality
property for our type-and-effect system: given two programs $s_1$, $s_2$ typable with 
preconditions $P_1$ and $P_2$, respectively, if $P_1 * P_2$ is compatible then the
parallel composition of the two programs is typable with precondition $P_1 * P_2$. 
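
A heavily simplified sketch of the compatibility side condition follows. It
tracks only the fractions of variables and channels and ignores scheduler
resources and their security levels entirely, so it is an approximation under
stated assumptions and not the paper's definition; the function names compose
and compatible are hypothetical.

```python
from fractions import Fraction

def compose(p1, p2):
    """Separating conjunction on fraction maps: add fractions pointwise."""
    out = dict(p1)
    for name, frac in p2.items():
        out[name] = out.get(name, Fraction(0)) + frac
    return out

def compatible(resource):
    """Simplified check: no name is owned with a total fraction above 1."""
    return all(frac <= 1 for frac in resource.values())

P1 = {"l": Fraction(1, 2), "h": Fraction(1)}   # precondition of s1
P2 = {"l": Fraction(1, 2)}                     # precondition of s2
P3 = {"h": Fraction(1, 2)}                     # clashes with P1 on h

print(compatible(compose(P1, P2)))   # True
print(compatible(compose(P1, P3)))   # False
```

In this reading, two threads can be composed when together they do not claim
more than the initial full ownership of any variable or channel.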

Our soundness proof is inspired by previous non-interference results proved 
using a combination of erasure and confluence (see footnote 4) for erased
programs, but requires a number of novel techniques related to our reschedule
construct, scheduler resources and support for benign races. A proof of Theorem 1
can be found in the full version of the paper. 


5 Related Work 


The problem of securing information flow in concurrent programs has received 
widespread attention. We review the relevant literature along the following three 
dimensions: 


(1) Scheduler-(in)dependence. Sabelfeld and Sands [41] argue for the importance of 
scheduler independence because in practice it may be difficult to account
for precise scheduler behavior under all circumstances, and attackers 
aware of the scheduler specifics can use that knowledge to their advantage, 


4 A property which guarantees that a given program can be reduced in different orders 
but yields the same result (up to a suitable equivalence relation). 
also known as refinement attacks. However, designing a scheduler-independent
enforcement technique that is also practical comes at the price of additional
restrictions. To this end, a number of approaches gain permissiveness
via scheduler support. This is manifested either as an assumption on a 
particular scheduling algorithm, e.g., round-robin, or scheduler awareness of 
security levels of the individual threads. 

(2) Permissiveness w.r.t. low races. We are interested in seeing which of the 
approaches support benign low non-determinism and permit low races. We 
believe this is an important factor from a practical perspective, because an 
approach capable of handling low races has the potential of scaling to
practical settings where parallel access, without extra synchronization overhead, 
to a single attacker-observable resource, such as network I/O, is desirable. 

(3) Termination-(in)sensitivity. In sequential programs, ignoring leaks via
program divergence is often a pragmatic choice, because the attacker is limited 
in how much information can be learned via the termination channel [3]. 
Can this pragmatic argument be carried over to a concurrent setting? On 
the one hand, malicious code with privileges to spawn threads may efficiently 
leak an N-bit secret by creating N threads and assigning every thread to 
leak a specific secret bit via the thread’s termination behavior [48]. Moti- 
vated by this, many approaches reject programs that may potentially diverge 
depending on a secret. On the other hand, while it is possible to use tech- 
niques from literature on program termination to improve precision of the 
enforcement [29], a pragmatic attacker can instead use provably-terminating 
programs that take as much time as is necessary for them to make their 
observations. So, for malicious code, one really needs to focus on the timing. 
But controlling timing behavior is difficult already in sequential programs, 
because many runtime aspects that have no source-level representation are 
in play, including hardware caches [50], memory management [35], or lazy 
evaluation [11]. 


Another reason for our attention to termination-(in)sensitivity is that, in
our experience, technical restrictions that impose termination (or timing)
sensitivity often simplify soundness proofs. Without such restrictions,
proving soundness for a (weaker) termination-insensitive definition can be more 
laborious. 


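                    | Scheduler-dependent or restricted     | Scheduler-independent
                    | to particular scheduler classes       |
--------------------+---------------------------------------+------------------------------
Low races allowed   | TI: [30]                              | TI: [14] (whole-program), ★
                    | TS: [9, 38, 25, 39, 7, 24, 45, 4, 10] | TS: [41] (+timing-sensitive)
--------------------+---------------------------------------+------------------------------
Low races forbidden | -                                     | TI: [49, 16, 46]
                    |                                       | TS: [16, 26]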


Fig. 12. Summary of the related work w.r.t. permissiveness of the language-based 
enforcement and scheduler dependence. TI stands for termination-insensitive; TS 
stands for termination-sensitive. 

Figure 12 presents a high-level summary of the related work. The figure is 
by no means exhaustive and lists only a few representative works; we discuss 
the other related papers below. Observe how the literature is divided across 
two diametrically opposite quadrants. Approaches that prioritize scheduler
independence are conservative in their treatment of low races. Approaches that do
permit low races require specific scheduler support or are confined to particular
classes of schedulers. We discuss these quadrants in detail, followed by a
discussion of rely-guarantee style reasoning for concurrent information flow and
rescheduling. 


5.1 Scheduler-Independent Approaches 


Observational Determinism. The approach of preventing races on individual 
locations was initiated in the work on observational determinism by Zdancewic 
and Myers [49] (which itself draws upon the ideas of McLean [27] and Roscoe 
[37]). Subsequent efforts on observational determinism include the work by
Huisman et al. [16] and by Terauchi [46]. Huisman et al. [16] identify an issue 
in Zdancewic and Myers' [49] definition of security: they construct a leaky 
program within the intended attacker model, i.e., not exploiting termination or 
timing, that is accepted by the definition (though it is ruled out by the type 
system). They also propose a modified definition and show how to enforce it
using self-composition [8]. Terauchi's [46] paper presents a capability system 
with an inference algorithm for enforcing a restricted version of Zdancewic 
and Myers' [49] definition. 

Out of these, the work by Terauchi [46] is the closest to ours because of 
the use of fractional permissions, but there are important differences in the 
treatment of low races and the underlying semantic condition. Terauchi's 
[46] type system is motivated by the design goal to reject racy programs of the 
form $l := 0 \parallel l := 1$. This is done through tracking fractional permissions
on so-called abstract locations that represent a set of locations whose identity
cannot be separated statically. Our type system uses fractional permissions in a
similar spirit, but has additional expressivity (even without the scheduler
resource), because Terauchi's [46] typing also rules out programs such as
$l_1 := 0 \parallel l_2 := 1$, even when $l_1$ and $l_2$ are statically known to be
non-aliasing. This is because the type system has a restriction that groups all low
variables into a single abstract location. While this restriction is a necessity if
the attacker is assumed to observe the order of individual low assignments, it
effectively forces synchronization of all low-updating threads, regardless of
whether the updates are potentially racy or not. We do not have such a restriction
in our model. 

We suspect that lifting this restriction in Terauchi's [46] system to
accommodate a more permissive attacker model such as ours may be difficult without 
further changes to the type system, because their semantic security condition, 
being a variant of the one by Zdancewic and Myers [49], requires trace equiva- 
lence up to prefixing (and stuttering) for all locations in the set of the abstract 
low location. Without the typing restriction, the definition would appear to have 
the same semantic issue discovered by Huisman et al. [16]; the issue does not 
manifest itself with the restriction. 

Note that adapting the security condition proposed by Huisman et al. [16] 
into a language-based setting also appears tricky. The paper [16] presents both 
termination-insensitive and termination-sensitive variants of their take on obser- 
vational determinism. The key changes are the use of infinite traces instead of 
finite ones and requiring trace equivalence instead of prefix-equivalence (up to 
stuttering). Terauchi [46] expresses their concerns w.r.t. applicability of this 
definition ([46], Appendix A). We think there is an additional concern w.r.t. 
termination-insensitivity. Because the TI-definition requires equivalence of infi- 
nite low traces it rejects a program such as 


l := 1; while secret = 1 do skip; l := 2; while secret = 2 do skip 


This single-threaded program is a variant of a brute-force attack that is usually 
accepted by termination-insensitive definitions [3] and language-based techniques 
for information flow. We thus agree with Terauchi's conclusion [46] 
that enforcing such a condition via a type-based method without being overly 
conservative may prove difficult. 
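
The brute-force program above can be explored with a small sketch that records
the low writes observed within a bounded number of loop iterations. The fuel
parameter is a hypothetical device for cutting off divergent runs and is not
part of the paper.

```python
def low_trace(secret, fuel=10):
    """Return (low writes observed, whether the run terminated within fuel)."""
    trace = [1]                       # l := 1
    for guess in (1, 2):
        steps = 0
        while secret == guess:        # while secret = guess do skip
            steps += 1
            if steps >= fuel:
                return trace, False   # still looping when the budget ran out
        if guess == 1:
            trace.append(2)           # l := 2
    return trace, True

print(low_trace(1))   # ([1], False)
print(low_trace(2))   # ([1, 2], False)
print(low_trace(3))   # ([1, 2], True)
```

The finite low traces for different secrets are prefixes of one another, which
is what prefix-based termination-insensitive definitions accept. Requiring
equivalence of the full (infinite) low traces distinguishes secret = 1 from the
other cases and therefore rejects the program.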

By contrast, our approach builds upon the technique of explicit refiners [30, 
33], which allows non-determinism as long as it is not influenced by secrets, and 
does not exhibit the aforementioned semantic pitfalls. 

Whole program analysis can be used to enforce concurrent non-interference 
with a high precision. Giffhorn and Snelting [14] use a PDG-based whole program 
analysis to enforce relaxed low-security observational determinism (RLSOD) in 
Java programs. RLSOD is similar to our security condition in that it allows 
low-nondeterminism as long as it does not depend on secrets. 


Strong Security. Sabelfeld and Sands [41] present a definition of strong secu- 
rity that is a compositional semantic condition for a natural class of sched- 
ulers. The compositionality is attained by placing timing-sensitivity constraints 
on individual threads. This condition serves as a foundation for a number of 
works [13,19,22]. To establish timing sensitivity, these approaches often rely on 
program transformation [1,6,19,28]. A common limitation of the transformation- 
based techniques is that they do not apply to programs with high loops. Another 
concern is their general applicability, given the complexity of modern runtimes. 
A recent empirical study by Mantel and Starostin [23] investigates performance 
and security implications of these techniques, but as an initial step in this direc- 
tion the paper [23] has a number of simplifying assumptions, such as disabled 
JIT optimizations and non-malicious code. 


5.2 Scheduler-Dependent Approaches 


Scheduler-dependent approaches vary in their assumptions on the underlying 
scheduler. Boudol and Castellani [9] study a system-and-threads model where the 
scheduler code is explicit in the program source; a typing discipline regulates the 
secure interaction of the scheduler with the rest of the program [5]. 

Security-aware schedulers [7,38] track the security level of the program counter 
of each thread, and provide an interface guaranteeing that the timing of high
computations is not revealed to low ones; this interface is realized by suspending
all low threads while a high thread is alive. 

A number of approaches assume a particular scheduling strategy, typically 
round-robin [30,39,45]. Mantel and Sudbrock [24] define a class of robust sched- 
ulers as the schedulers where “the scheduling order of low threads does not 
depend on the high threads in a thread pool” [24]. The class of robust sched- 
ulers appears to be large enough to include a number of practical schedulers, 
including round-robin. Other works rely on nondeterministic [4,8,21,25,40,44] 
or probabilistically uniform [10,43,47] behavior. 


5.3 Rely-Guarantee Style Reasoning for Concurrent Information 
Flow and Rescheduling 


Rely-Guarantee Style Reasoning. Mantel et al. [26] develop a different
rely-guarantee style compositional approach for concurrent non-interference in flow- 
sensitive settings. In this approach, permissions to read or write variables are 
expressed using special data access modes; a thread can obtain an exclusive read 
access or an exclusive write access via the specific mode. Note that the modes are 
different from fractional permissions, because, e.g., an exclusive write access to a 
variable does not automatically grant the exclusive read access. The modes also 
do not have a moral equivalent of the scheduler resource. Instead, the paper [26] 
suggests using an external may-happen-in-parallel global analysis to track their 
global consistency. Askarov et al. [4] give modes a runtime representation, and 
use a hybrid information flow monitor to establish concurrent non-interference. 
Li et al. [20] use rely-guarantee style reasoning to reason about information flows 
in message-passing distributed settings, where the scheduler cannot be controlled. 
Murray et al. [31] use mode-based reasoning in a flow-sensitive dependent type 
system to enforce timing-sensitive value-dependent non-interference for shared 
memory concurrent programs. 


Rescheduling. The idea of barrier synchronization to recover permissiveness of 
language-based enforcement appears in papers with possibilistic scheduling [4, 
25]. Rescheduling, however, does more than simple barrier synchronization:
it also explicitly resets the scheduler state, which is crucial to avoid refinement 
attacks. The reason that simple barrier synchronization is insufficient is that 
despite synchronization at the barrier point, the scheduler state could be tainted 
by what happens before threads reach the barrier. For example, if the scheduler 
is implemented so that, after the barrier, the threads are scheduled to run in the 
order in which they arrived at the barrier, then there is little to be gained from
the barrier synchronization. 
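
The arrival-order scenario can be sketched in a few lines. The thread names A
and B, the step counts, and the values written are all hypothetical choices for
illustration; the only assumption carried over from the text is that threads
run after the barrier in the order in which they reached it.

```python
def final_l_after_barrier(h):
    """Final value of l when post-barrier scheduling follows arrival order."""
    # Steps each thread needs before reaching the barrier; A's work depends on h.
    work = {"A": 1 if h else 3, "B": 2}
    arrival = sorted(work, key=lambda name: work[name])
    l = None
    for name in arrival:              # after the barrier: run in arrival order
        l = 0 if name == "A" else 1   # A performs l := 0, B performs l := 1
    return l

print(final_l_after_barrier(True), final_l_after_barrier(False))   # 1 0
```

Even though both threads synchronize at the barrier, the order of the low
writes after it still reveals h, because the scheduler state was never reset.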

Operationally, the reschedule is implementable in a straightforward man- 
ner, which is much simpler than security-aware schedulers [7,38]. We 
note that rescheduling allows programmers to explore the space of perfor- 
mance/expressivity without losing security. A program that type checks without 
reschedule, because there are no dangerous race conditions, does not need to 
suffer from the performance overhead of the rescheduling. Programmers only 
need to add the reschedule instruction if they wish to re-enable low races after 
the scheduler was tainted. In that light, rescheduling is no less practical than 
the earlier mentioned barrier synchronization [4]. 

While the need to reschedule may appear heavy-handed, we are not 
aware of other techniques that re-enable low races when the scheduler can be 
tainted. How exactly the scheduler gets tainted depends on the scheduler imple- 
mentation/model. Presently, we assume that any local control flow that depends 
on secrets may taint the scheduler. This conservative assumption can naturally 
be relaxed for more precise/realistic scheduler models. Future research efforts will 
focus on refining scheduler models to reduce the need for rescheduling and/or 
automatic placement of rescheduling to lessen the burden on programmers. The 
latter can utilize techniques from the literature on the automatic placement of 
declassifications [18]. 


5.4 This Work in the Context of Fig. 12 


Developing a sound compositional technique for concurrent information flow that 
is scheduler-independent, low-nondeterministic, and termination-insensitive at 
the same time—a point marked by the star symbol in Fig. 12—is a tall order, 
but we believe we come close. Our only non-standard operation is reschedule,
which we argue has a simple operational implementation and can be introduced 
to many existing runtimes. 


6 Conclusion and Future Work 


In the paper, we have presented a new compositional model for enforcing infor- 
mation flow security against internal timing leaks for concurrent imperative pro- 
grams. The model includes a compositional fine-grained type-and-effect system 
and a novel programming construct for resetting the scheduler state. The type
system is agnostic to the attacker level, which means that one typing judgment 
is sufficient to ensure security for all possible attacker levels. We formulate and 
prove the soundness result for the type system. 

In future work, we wish to support I/O; our proof technique appears to 
have all the necessary ingredients for that. Moreover, we wish to investigate a 
generalization of our concurrency model to an X10-like [30,42] setting where 
instead of one scheduler, we have several coarse-grained scheduling partitions. 


Abstract. We give a rigorous characterization of what it means for a 
programming language to be memory safe, capturing the intuition that 
memory safety supports local reasoning about state. We formalize this 
principle in two ways. First, we show how a small memory-safe language 
validates a noninterference property: a program can neither affect nor be 
affected by unreachable parts of the state. Second, we extend separation 
logic, a proof system for heap-manipulating programs, with a “memory- 
safe variant” of its frame rule. The new rule is stronger because it applies 
even when parts of the program are buggy or malicious, but also weaker 
because it demands a stricter form of separation between parts of the pro- 
gram state. We also consider a number of pragmatically motivated vari- 
ations on memory safety and the reasoning principles they support. As 
an application of our characterization, we evaluate the security of a pre- 
viously proposed dynamic monitor for memory safety of heap-allocated 
data. 


1 Introduction 


Memory safety, and the vulnerabilities that follow from its absence [43], are 
common concerns. So what is it, exactly? Intuitions abound, but translating 
them into satisfying formal definitions is surprisingly difficult [20]. 

In large part, this difficulty stems from the prominent role that informal, 
everyday intuition assigns, in discussions of memory safety, to a range of errors 
related to memory misuse—buffer overruns, double frees, etc. Characterizing 
memory safety in terms of the absence of these errors is tempting, but this 
falls short for two reasons. First, there is often disagreement on which behaviors 
qualify as errors. For example, many real-world C programs intentionally rely 
on unrestricted pointer arithmetic [28], though it may yield undefined behavior 
according to the language standard [21, Sect. 6.5.6]. Second, from the perspective 
of security, the critical issue is not the errors themselves, but rather the fact that, 
when they occur in unsafe languages like C, the program’s ensuing behavior is 
determined by obscure, low-level factors such as the compiler’s choice of run- 
time memory layout, often leading to exploitable vulnerabilities. By contrast, in 
memory-safe languages like Java, programs can attempt to access arrays out of 
bounds, but such mistakes lead to sensible, predictable outcomes. 


© The Author(s) 2018 
Rather than attempting a definition in terms of bad things that cannot
happen, we aim to formalize memory safety in terms of reasoning principles that 
programmers can soundly apply in its presence (or conversely, principles that 
programmers should not naively apply in unsafe settings, because doing so can 
lead to serious bugs and vulnerabilities). Specifically, to give an account of mem- 
ory safety, as opposed to more inclusive terms such as “type safety,” we focus on 
reasoning principles that are common to a wide range of stateful abstractions, 
such as records, tagged or untagged unions, local variables, closures, arrays, call 
stacks, objects, compartments, and address spaces. 

What sort of reasoning principles? Our inspiration comes from separation 
logic [36], a variant of Hoare logic designed to verify complex heap-manipulating 
programs. The power of separation logic stems from local reasoning about state: 
to prove the correctness of a program component, we must argue that its memory 
accesses are confined to a footprint, a precise region demarcated by the specifi- 
cation. This discipline allows proofs to ignore regions outside of the footprint, 
while ensuring that arbitrary invariants for these regions are preserved during 
execution. 

The locality of separation logic is deeply linked to memory safety. Consider a 
hypothetical jpeg decoding procedure that manipulates image buffers. We might 
expect its execution not to interfere with the integrity of an unrelated window 
object in the program. We can formalize this requirement in separation logic by 
proving a specification that includes only the image buffers, but not the window, 
in the decoder’s footprint. Showing that the footprint is respected would amount 
to checking the bounds of individual buffer accesses, thus enforcing memory 
safety; conversely, if the decoder is not memory safe, a simple buffer overflow 
might suffice to tamper with the window object, thus violating locality and 
potentially paving the way to an attack. 

Our aim is to extend this line of reasoning beyond conventional separation 
logic, encompassing settings such as ML, Java, or Lisp that enforce memory 
safety automatically without requiring complete correctness proofs—which can 
be prohibitively expensive for large code bases, especially in the presence of third- 
party libraries or plugins over which we have little control. The key observation 
is that memory safety forces code to respect a natural footprint: the set of its 
reachable memory locations (reachable with respect to the variables it mentions). 
Suppose that the jpeg decoder above is written in Java. Though we may not 
know much about its input-output behavior, we can still assert that it cannot 
have any effect on the window object simply by replacing the detailed reasoning 
demanded by separation logic by a simple inaccessibility check. 

Our first contribution is to formalize local reasoning principles supported by 
an ideal notion of memory safety, using a simple language (Sect. 2) to ground our 
discussion. We show three results (Theorems 1, 3 and 4) that explain how the 
execution of a piece of code is affected by extending its initial heap. These results 
lead to a noninterference property (Corollary 1), ensuring that code cannot affect 
or be affected by unreachable memory. In Sect.3.3, we show how these results 
yield a variant of the frame rule of separation logic (Theorem 6), which embodies 
its local reasoning capabilities. The two variants have complementary strengths
and weaknesses: while the original rule applies to unsafe settings like C, but 
requires comprehensively verifying individual memory accesses, our variant does 
not require proving that every access is correct, but demands a stronger notion 
of separation between memory regions. These results have been verified with the 
Coq proof assistant. 1 

Our second contribution (Sect.4) is to evaluate pragmatically motivated 
relaxations of the ideal notion above, exploring various trade-offs between safety, 
performance, flexibility, and backwards compatibility. These variants can be 
broadly classified into two groups according to reasoning principles they sup- 
port. The stronger group gives up on some secrecy guarantees, but still ensures 
that pieces of code cannot modify the contents of unreachable parts of the heap. 
The weaker group, on the other hand, leaves gaps that completely invalidate 
reachability-based reasoning. 

Our third contribution (Sect.5) is to demonstrate how our characterization 
applies to more realistic settings, by analyzing a heap-safety monitor for machine 
code [5,15]. We prove that the abstract machine that it implements also satisfies a 
noninterference property, which can be transferred to the monitor via refinement, 
modulo memory exhaustion issues discussed in Sect. 4. These proofs are also done 
in Coq. 2

We discuss related work on memory safety and stronger reasoning principles 
in Sect.6, and conclude in Sect.7. While memory safety has seen prior formal 
investigation (e.g. [31,41]), our characterization is the first phrased in terms of 
reasoning principles that are valid when memory safety is enforced automat- 
ically. We hope that these principles can serve as good criteria for formally 
evaluating such enforcement mechanisms in practice. Moreover, our definition is 
self-contained and does not rely on additional features such as full-blown capabil- 
ities, objects, module systems, etc. Since these features tend to depend on some 
form of memory safety anyway, we could see our characterization as a common 
core of reasoning principles that underpin all of them. 


2 An Idealized Memory-Safe Language 


Our discussion begins with a concrete case study: a simple imperative language 
with manual memory management. It features several mechanisms for control- 
ling the effects of memory misuse, ranging from the most conventional, such as 
bounds checking for spatial safety, to more uncommon ones, such as assigning 
unique identifiers to every allocated block for ensuring temporal safety. 
Choosing a language with manual memory management may seem odd, since 
safety is often associated with garbage collection. We made this choice for two 
reasons. First, most discussions on memory safety are motivated by its absence 
from languages like C that also rely on manual memory management. There is 


1 The proofs are available at: https://github.com/arthuraa/memory-safe-language.
2 Available at https://github.com/micro-policies/micro-policies-coq/tree/master/memory-safety.
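Command | Description
$x \leftarrow e$ | Local assignment
$x \leftarrow [e]$ | Read from heap
$[e_1] \leftarrow e_2$ | Heap assignment
$x \leftarrow \mathrm{alloc}(e_{\mathit{size}})$ | Allocation
$\mathrm{free}(e)$ | Deallocation
$\mathrm{skip}$ | Do nothing
$\mathrm{if}\ e\ \mathrm{then}\ c_1\ \mathrm{else}\ c_2$ | Conditional
$\mathrm{while}\ e\ \mathrm{do}\ c\ \mathrm{end}$ | Loop
$c_1; c_2$ | Sequencing

$$\begin{aligned}
s \in S &\triangleq L \times M && \text{(states)} \\
l \in L &\triangleq \mathrm{var} \rightharpoonup_{\mathrm{fin}} V && \text{(local stores)} \\
m \in M &\triangleq I \times \mathbb{Z} \rightharpoonup_{\mathrm{fin}} V && \text{(heaps)} \\
v \in V &\triangleq \mathbb{Z} \uplus \mathbb{B} \uplus \{\mathrm{nil}\} \uplus I \times \mathbb{Z} && \text{(values)} \\
O &\triangleq S \uplus \{\mathrm{error}\} && \text{(outcomes)} \\
I &\triangleq \text{a countably infinite set} \\
X \rightharpoonup_{\mathrm{fin}} Y &\triangleq \text{finite partial functions } X \to Y
\end{aligned}$$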


Fig. 1. Syntax, states and values 


a vast body of research that tries to make such languages safer, and we would 
like our account to apply to it. Second, we wanted to stress that our charac- 
terization does not depend fundamentally on the mechanisms used to enforce 
memory safety, especially because they might have complementary advantages 
and shortcomings. For example, manual memory management can lead to more 
memory leaks; garbage collectors can degrade performance; and specialized type 
systems for managing memory [37,41] are more complex. After a brief overview 
of the language, we explore its reasoning principles in Sect. 3. 

Figure 1 summarizes the language syntax and other basic definitions. Expressions
$e$ include variables $x \in \mathrm{var}$, numbers $n \in \mathbb{Z}$, booleans $b \in \mathbb{B}$, an invalid
pointer nil, and various operations, both binary (arithmetic, logic, etc.) and
unary (extracting the offset of a pointer). We write $[e]$ for dereferencing the
pointer denoted by $e$.

Programs operate on states consisting of two components: a local store, which 
maps variables to values, and a heap, which maps pointers to values. Pointers 
are not bare integers, but rather pairs $(i, n)$ of a block identifier $i \in I$ and an
offset $n \in \mathbb{Z}$. The offset is relative to the corresponding block, and the identifier
i need not bear any direct relation to the physical address that might be used in 
a concrete implementation on a conventional machine. (That is, we can equiva- 
lently think of the heap as mapping each identifier to a separate array of heap 
cells.) Similar structured memory models are widely used in the literature, as in 
the CompCert verified C compiler [26] and other models of the C language [23], 
for instance. 

We write [c](s) to denote the outcome of running a program c in an initial 
state s, which can be either a successful final state s’ or a fatal run-time error. 
Note that [c] is partial, to account for non-termination. Similarly, [e](s) denotes 
the result of evaluating the expression e on the state s (expression evaluation is 
total and has no side effects). The formal definition of these functions is left to 
the Appendix; we just single out a few aspects that have a crucial effect on the 
security properties discussed later. 
Illegal Memory Accesses Lead to Errors. The language controls the effect of
memory misuse by raising errors that stop execution immediately. This con- 
trasts with typical C implementations, where such errors lead to unpredictable 
undefined behavior. The main errors are caused by reads, writes, and frees to the 
current memory m using invalid pointers—that is, pointers p such that m(p) is 
undefined. Such pointers typically arise by offsetting an existing pointer out of 
bounds or by freeing a structure on the heap (which turns all other pointers to 
that block in the program state into dangling ones). In common parlance, this 
discipline ensures both spatial and temporal memory safety. 
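
The discipline just described can be illustrated with a small executable model. The following Python sketch is not the paper's Coq development: the names `Heap`, `MemoryFault`, and `faults` are hypothetical, fresh identifiers are assumed to come from an increasing counter, and `free` is assumed to accept only a pointer to the start of a live block. It shows how keying the heap by (identifier, offset) pairs turns both out-of-bounds and dangling accesses into immediate errors.

```python
class MemoryFault(Exception):
    """Fatal run-time error raised by any illegal memory access."""


class Heap:
    """Heap whose cells are addressed by (block_id, offset) pairs."""

    def __init__(self):
        self.cells = {}     # maps (block_id, offset) to a value
        self.next_id = 0    # assumption: fresh identifiers come from a counter

    def alloc(self, size):
        block_id = self.next_id
        self.next_id += 1
        for offset in range(size):
            self.cells[(block_id, offset)] = 0  # new memory is initialized to 0
        return (block_id, 0)

    def _check(self, ptr):
        # A pointer is valid exactly when the heap is defined on it.
        if ptr not in self.cells:
            raise MemoryFault(f"invalid pointer {ptr!r}")

    def read(self, ptr):
        self._check(ptr)
        return self.cells[ptr]

    def write(self, ptr, value):
        self._check(ptr)
        self.cells[ptr] = value

    def free(self, ptr):
        # Assumption: only a pointer to the start of a live block can be freed.
        self._check(ptr)
        block_id, offset = ptr
        if offset != 0:
            raise MemoryFault(f"free of interior pointer {ptr!r}")
        self.cells = {p: v for p, v in self.cells.items() if p[0] != block_id}


def faults(action):
    """Return True if running the zero-argument callable raises MemoryFault."""
    try:
        action()
    except MemoryFault:
        return True
    return False


heap = Heap()
buf = heap.alloc(2)
heap.write((buf[0], 1), 7)
```

In this model, reading `(buf[0], 2)` faults because the offset lies outside the block (spatial safety), and any access through `buf` after `heap.free(buf)` faults because the identifier is never reused (temporal safety).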


Block Identifiers are Capabilities. Pointers can only be used to access memory 
corresponding to their identifiers, which effectively act as capabilities. Identifiers 
are set at allocation time, where they are chosen to be fresh with respect to the 
entire current state (i.e., the new identifier is not associated with any pointers 
defined in the current memory, stored in local variables, or stored on the heap). 
Once assigned, identifiers are immutable, making it impossible to fabricate a 
pointer to an allocated block out of thin air. This can be seen, for instance, in 
the semantics of addition, which allows pointer arithmetic but does not affect 
identifiers: 


$$[e_1 + e_2](s) = \begin{cases} n_1 + n_2 & \text{if } [e_i](s) = n_i \\ (i, n_1 + n_2) & \text{if } [e_1](s) = (i, n_1) \text{ and } [e_2](s) = n_2 \\ \mathrm{nil} & \text{otherwise} \end{cases}$$


For simplicity, nonsensical combinations such as adding two pointers simply 
result in the nil value. A real implementation might represent identifiers with 
hardware tags and use an increasing counter to generate identifiers for new blocks 
(as done by Dhawan et al. [15]; see Sect. 5.1); if enough tags are available, every 
identifier will be fresh. 
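
The three cases of the addition semantics can be transcribed almost literally. The sketch below is illustrative only: it assumes that Python integers stand for numbers, 2-tuples for pointers, and `None` for nil, and the helper names are hypothetical. The point it makes is that arithmetic can move the offset of a pointer but never changes its identifier.

```python
NIL = None  # stands in for the invalid pointer nil (an assumption of this sketch)


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_pointer(value):
    return isinstance(value, tuple) and len(value) == 2


def add(v1, v2):
    """Addition on values: number + number, pointer + number, else nil."""
    if is_number(v1) and is_number(v2):
        return v1 + v2
    if is_pointer(v1) and is_number(v2):
        block_id, offset = v1
        return (block_id, offset + v2)  # the identifier is left untouched
    return NIL  # nonsensical combinations, such as pointer + pointer
```

Following the case analysis as stated, only the pointer-plus-number order is accepted here; a number on the left and a pointer on the right falls into the "otherwise" case.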


Block Identifiers Cannot be Observed. Because of the freshness condition above, 
identifiers can reveal information about the entire program state. For example, 
if they are chosen according to an increasing counter, knowing what identifier 
was assigned to a new block tells us how many allocations have been performed. 
A concrete implementation would face similar issues related to the choice of 
physical addresses for new allocations. (Such issues are commonplace in systems 
that combine dynamic allocation and information-flow control [12].) For this 
reason, our language keeps identifiers opaque and inaccessible to programs; they 
can only be used to reference values in memory, and nothing else. We discuss a 
more permissive approach in Sect. 4.2. 

Note that hiding identifiers doesn’t mean we have to hide everything asso- 
ciated with a pointer: besides using pointers to access memory, programs can 
also safely extract their offsets and test if two pointers are equal (which means 
equality for both offsets and identifiers). Our Coq development also shows that 
it is sound to compute the size of a memory block via a valid pointer. 
New Memory is Always Initialized. Whenever a memory block is allocated, all
of its contents are initialized to 0. (The exact value does not matter, as long as it is
some constant that is not a previously allocated pointer.) This is important for 
ensuring that allocation does not leak secrets present in previously freed blocks; 
we return to this point in Sect. 4.3. 


3 Reasoning with Memory Safety 


Having presented our language, we now turn to the reasoning principles that it 
supports. Intuitively, these principles allow us to analyze the effect of a piece of 
code by restricting our attention to a smaller portion of the program state. A first 
set of frame theorems (1, 3, and 4) describes how the execution of a piece of code 
is affected by extending the initial state on which it runs. These in turn imply 
a noninterference property, Corollary 1, guaranteeing that program execution is 
independent of inaccessible memory regions—that is, those that correspond to 
block identifiers that a piece of code does not possess. Finally, in Sect. 3.3, we 
discuss how the frame theorems can be recast in the language of separation logic, 
leading to a new variant of its frame rule (Theorem 6). 


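$$\begin{aligned}
(l_1, m_1) \cup (l_2, m_2) &\triangleq (l_1 \cup l_2, m_1 \cup m_2) && \text{(state union)} \\
(f \cup g)(x) &\triangleq \begin{cases} f(x) & \text{if } x \in \mathrm{dom}(f) \\ g(x) & \text{otherwise} \end{cases} && \text{(partial function union)} \\
\mathrm{blocks}(l, m) &\triangleq \{ i \in I \mid \exists n, (i, n) \in \mathrm{dom}(m) \} && \text{(identifiers of live blocks)} \\
\mathrm{ids}(l, m) &\triangleq \mathrm{blocks}(l, m) \cup \{ i \mid \exists x, n, l(x) = (i, n) \} \cup \{ i \mid \exists p, n, m(p) = (i, n) \} && \text{(all identifiers in state)} \\
\mathrm{vars}(l, m) &\triangleq \mathrm{dom}(l) && \text{(defined local variables)} \\
\mathrm{vars}(c) &\triangleq \text{local variables of program } c \\
X \mathrel{\#} Y &\triangleq (X \cap Y = \emptyset) && \text{(disjoint sets)} \\
\pi \cdot s &\triangleq \text{rename identifiers with permutation } \pi
\end{aligned}$$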


Fig. 2. Basic notation 


3.1 Basic Properties of Memory Safety 


Figure 2 summarizes basic notation used in our results. By permutation, we mean
a function $\pi : I \to I$ that has a two-sided inverse $\pi^{-1}$; that is, $\pi \circ \pi^{-1} = \pi^{-1} \circ \pi = \mathrm{id}_I$.
Some of these operations are standard and omitted for brevity. 3


3 The renaming operation $\pi \cdot s$, in particular, can be derived formally by viewing $S$
as a nominal set over I [34] obtained by combining products, disjoint unions, and 
partial functions. 
The first frame theorem states that, if a program terminates successfully,
then we can extend its initial state almost without affecting execution. 


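Theorem 1 (Frame OK). Let $c$ be a command, and $s_1$, $s_1'$, and $s_2$ be states.
Suppose that $[c](s_1) = s_1'$, $\mathrm{vars}(c) \subseteq \mathrm{vars}(s_1)$, and $\mathrm{blocks}(s_1) \# \mathrm{blocks}(s_2)$. Then
there exists a permutation $\pi$ such that $[c](s_1 \cup s_2) = \pi \cdot s_1' \cup s_2$ and
$\mathrm{blocks}(\pi \cdot s_1') \# \mathrm{blocks}(s_2)$.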


The second premise, $\mathrm{vars}(c) \subseteq \mathrm{vars}(s_1)$, guarantees that all the variables needed
to run $c$ are already defined in $s_1$, implying that their values do not change once
we extend that initial state with $s_2$. The third premise, $\mathrm{blocks}(s_1) \# \mathrm{blocks}(s_2)$,
means that the memories of $s_1$ and $s_2$ store disjoint regions. Finally, the conclusion
of the theorem states that (1) the execution of $c$ does not affect the extra
state $s_2$ and (2) the rest of the result is almost the same as $s_1'$, except for a
permutation of block identifiers.

Permutations are needed to avoid clashes between identifiers in s2 and those 
assigned to regions allocated by $c$ when running on $s_1$. For instance, suppose that
the execution of $c$ on $s_1$ allocated a new block, and that this block was assigned
some identifier $i \in I$. If the memory of $s_2$ already had a block corresponding to
$i$, $c$ would have to choose a different identifier $i'$ for allocating that block when
running on $s_1 \cup s_2$. This change requires replacing all occurrences of $i$ by $i'$ in
the result of the first execution, which can be achieved with a permutation that 
swaps these two identifiers. 
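
The clash between identifiers can be replayed on a concrete instance. The following Python sketch is only an illustration under stated assumptions: states are pairs of dictionaries, the allocator picks the smallest identifier not occurring anywhere in the state, and the fixed program `x <- alloc(1); [x] <- 5` is hard-coded in the hypothetical function `run`. It checks the conclusion of Theorem 1 for one choice of states, using the permutation that swaps identifiers 0 and 1.

```python
def ids(store, heap):
    """All block identifiers occurring in a state (live blocks and stored pointers)."""
    found = {block for (block, _) in heap}
    for value in list(store.values()) + list(heap.values()):
        if isinstance(value, tuple):
            found.add(value[0])
    return found


def fresh(store, heap):
    # Assumption: the allocator picks the smallest identifier unused in the state.
    used = ids(store, heap)
    candidate = 0
    while candidate in used:
        candidate += 1
    return candidate


def run(store, heap):
    """Hard-coded program: x <- alloc(1); [x] <- 5."""
    store, heap = dict(store), dict(heap)
    block = fresh(store, heap)
    heap[(block, 0)] = 0           # allocation zero-initializes the new cell
    store["x"] = (block, 0)
    heap[store["x"]] = 5
    return store, heap


def rename(pi, store, heap):
    """Apply the identifier permutation pi to every pointer in a state."""
    def on_value(value):
        return (pi(value[0]), value[1]) if isinstance(value, tuple) else value
    new_store = {x: on_value(v) for x, v in store.items()}
    new_heap = {(pi(b), n): on_value(v) for (b, n), v in heap.items()}
    return new_store, new_heap


def swap(i, j):
    return lambda k: j if k == i else i if k == j else k


s1_store, s1_heap = {"x": None}, {}      # x is defined, so vars(c) is covered
s2_heap = {(0, 0): 9}                    # the extra state already owns block 0

s1_final = run(s1_store, s1_heap)                        # allocates block 0
extended_final = run(s1_store, {**s1_heap, **s2_heap})   # must allocate block 1

renamed_store, renamed_heap = rename(swap(0, 1), *s1_final)
frame_ok = extended_final == (renamed_store, {**renamed_heap, **s2_heap})
```

Here `frame_ok` holds: the run on the extended state equals the renamed result of the original run, joined with the untouched extra block.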

The proof of Theorem 1 relies crucially on the facts that programs cannot 
inspect identifiers, that memory can grow indefinitely (a common assumption in 
formal models of memory), and that memory operations fail on invalid pointers. 
Because of the permutations, we also need to show that permuting the initial
state $s$ of a command $c$ with any permutation $\pi$ yields the same outcome, up to
some additional permutation $\pi'$ that again accounts for different choices of fresh
identifiers.


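Theorem 2 (Renaming states). Let $s$ be a state, $c$ a command, and $\pi$ a
permutation. There exists $\pi'$ such that:

$$[c](\pi \cdot s) = \begin{cases} \mathrm{error} & \text{if } [c](s) = \mathrm{error} \\ \bot & \text{if } [c](s) = \bot \\ \pi' \cdot \pi \cdot s' & \text{if } [c](s) = s' \end{cases}$$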


A similar line of reasoning yields a second frame theorem, which says that 
we cannot make a program terminate just by extending its initial state. 


Theorem 3 (Frame Loop). Let $c$ be a command, and $s_1$ and $s_2$ be states.
If $[c](s_1) = \bot$, $\mathrm{vars}(c) \subseteq \mathrm{vars}(s_1)$, and $\mathrm{blocks}(s_1) \# \mathrm{blocks}(s_2)$, then
$[c](s_1 \cup s_2) = \bot$.


The third frame theorem shows that extending the initial state also preserves 
erroneous executions. Its statement is similar to the previous ones, but with 
a subtle twist. In general, by extending the state of a program with a block, 
we might turn an erroneous execution into a successful one, if the error was
caused by accessing a pointer whose identifier matches that new block. To avoid
this, we need a different premise ($\mathrm{ids}(s_1) \# \mathrm{blocks}(s_2)$) preventing any pointers
in the original state $s_1$ from referencing the new blocks in $s_2$ (which is only
useful because our language prevents programs from forging pointers to existing
regions). Since $\mathrm{blocks}(s) \subseteq \mathrm{ids}(s)$, this premise is stronger than the analogous
ones in the preceding results.


Theorem 4 (Frame Error). Let $c$ be a command, and $s_1$ and $s_2$ be states. If
$[c](s_1) = \mathrm{error}$, $\mathrm{vars}(c) \subseteq \mathrm{vars}(s_1)$, and $\mathrm{ids}(s_1) \# \mathrm{blocks}(s_2)$, then
$[c](s_1 \cup s_2) = \mathrm{error}$.


3.2 Memory Safety and Noninterference 


The consequences of memory safety analyzed so far are intimately tied to the 
notion of noninterference [19]. In its most widely understood sense, noninterfer- 
ence is a secrecy guarantee: varying secret inputs has no effect on public outputs. 
Sometimes, however, it is also used to describe integrity guarantees: low-integrity 
inputs have no effect on high-integrity outputs. In fact, both guarantees apply 
to unreachable memory in our language, since they do not affect code execution; 
that is, execution (1) cannot modify these inaccessible regions (preserving their 
integrity), and (2) cannot learn anything meaningful about them, not even their 
presence (preserving their secrecy). 


Corollary 1 (Noninterference). Let $s_1$, $s_{21}$, and $s_{22}$ be states and $c$ be a
command. Suppose that $\mathrm{vars}(c) \subseteq \mathrm{vars}(s_1)$, that $\mathrm{ids}(s_1) \# \mathrm{blocks}(s_{21})$ and that
$\mathrm{ids}(s_1) \# \mathrm{blocks}(s_{22})$. When running $c$ on the extended states $s_1 \cup s_{21}$ and
$s_1 \cup s_{22}$, only one of the following three possibilities holds: (1) both executions loop
($[c](s_1 \cup s_{21}) = [c](s_1 \cup s_{22}) = \bot$); (2) both executions terminate with an error
($[c](s_1 \cup s_{21}) = [c](s_1 \cup s_{22}) = \mathrm{error}$); or (3) both executions successfully terminate
without interfering with the inaccessible portions $s_{21}$ and $s_{22}$ (formally, there
exists a state $s_1'$ and permutations $\pi_1$ and $\pi_2$ such that $[c](s_1 \cup s_{2i}) = \pi_i \cdot s_1' \cup s_{2i}$
and $\mathrm{ids}(\pi_i \cdot s_1') \# \mathrm{blocks}(s_{2i})$, for $i = 1, 2$).


Noninterference is often formulated using an indistinguishability relation on 
states, which expresses that one state can be obtained from the other by vary- 
ing its secrets. We could have equivalently phrased the above result in a similar 
way. Recall that the hypothesis $\mathrm{ids}(s_1) \# \mathrm{blocks}(s_2)$ means that memory regions
stored in $s_2$ are unreachable via $s_1$. Then, we could call two states “indistinguishable”
if the reachable portions are the same (except for a possible permutation).
In Sect. 4, the connection with noninterference will provide a good benchmark 
for comparing different flavors of memory safety. 


3.3 Memory Safety and Separation Logic 


We now explore the relation between the principles identified above, espe- 
cially regarding integrity, and the local reasoning facilities of separation logic. 
Separation logic targets specifications of the form $\{p\}\ c\ \{q\}$, where $p$ and $q$
are predicates over program states (subsets of $S$). For our language, this could
roughly mean

$$\forall s \in p,\ \mathrm{vars}(c) \subseteq \mathrm{vars}(s) \Rightarrow [c](s) \in q \cup \{\bot\}.$$

That is, if we run $c$ in a state satisfying $p$, it will either diverge or terminate in a
state that satisfies q, but it will not trigger an error. Part of the motivation for 
precluding errors is that in unsafe settings like C they yield undefined behavior, 
destroying all hope of verification. 

Local reasoning in separation logic is embodied by the frame rule, a conse- 
quence of Theorems 1 and 3. Roughly, it says that a verified program can only 
affect a well-defined portion of the state, with all other memory regions left 
untouched. 4


Theorem 5. Let $p$, $q$, and $r$ be predicates over states and $c$ be a command. The
rule

$$\frac{\mathrm{independent}(r, \mathrm{modvars}(c)) \qquad \{p\}\ c\ \{q\}}{\{p * r\}\ c\ \{q * r\}}\ \textsc{Frame}$$

is sound, where $\mathrm{modvars}(c)$ is the set of local variables modified by $c$,
$\mathrm{independent}(r, V)$ means that the assertion $r$ does not depend on the set of local
variables $V$:

$$\forall l_1\, l_2\, m,\ (\forall x \notin V,\ l_1(x) = l_2(x)) \Rightarrow (l_1, m) \in r \Rightarrow (l_2, m) \in r,$$

and $p * r$ denotes the separating conjunction of $p$ and $r$:

$$\{ (l, m_1 \cup m_2) \mid (l, m_1) \in p,\ (l, m_2) \in r,\ \mathrm{blocks}(l, m_1) \# \mathrm{blocks}(l, m_2) \}.$$


As useful as it is, precluding errors during execution makes it difficult to use 
separation logic for partial verification: proving any property, no matter how 
simple, of a nontrivial program requires detailed reasoning about its internals. 
Even the following vacuous rule is unsound in separation logic:

$$\frac{}{\{p\}\ c\ \{\mathrm{true}\}}\ \textsc{Taut}$$

For a counterexample, take $p$ to be true and $c$ to be some arbitrary memory read
$x \leftarrow [y]$. If we run $c$ on an empty heap, which trivially satisfies the precondition,
we obtain an error, contradicting the specification.

Fortunately, our memory-safe language (in which errors have a sensible, predictable
semantics, as opposed to wild undefined behavior) supports a variant of
separation logic that allows looser specifications of the form $\{p\}\ c\ \{q\}_e$, defined as

$$\forall s \in p,\ \mathrm{vars}(c) \subseteq \mathrm{vars}(s) \Rightarrow [c](s) \in q \cup \{\bot, \mathrm{error}\}.$$


4 Technically, the frame rule requires a slightly stronger notion of specification, 
accounting for permutations of allocated identifiers; our Coq development has a 
more precise statement. 
These specifications are weaker than their conventional counterparts, leading
to a subsumption rule:

$$\frac{\{p\}\ c\ \{q\}}{\{p\}\ c\ \{q\}_e}$$


Because errors are no longer prevented, the TAUT rule $\{p\}\ c\ \{\mathrm{true}\}_e$ becomes
sound, since the true postcondition now means that any outcome whatsoever
is acceptable. Unfortunately, there is a price to pay for allowing errors: they
compromise the soundness of the frame rule. The reason, as hinted in the introduction,
is that preventing run-time errors has an additional purpose in separation
logic: it forces programs to act locally, that is, to access only the memory
delimited by their pre- and postconditions. To see why, consider the same program
$c$ as above, $x \leftarrow [y]$. This program clearly yields an error when run on an empty
heap, implying that the triple $\{\mathrm{emp}\}\ c\ \{x = 0\}_e$ is valid, where the predicate
emp holds of any state with an empty heap and $x = 0$ holds of states whose
local store maps $x$ to 0. Now consider what happens if we try to apply an analog
of the frame rule to this triple using the frame predicate $y \mapsto 1$, which holds in
states where $y$ contains a pointer to the unique defined location on the heap,
which stores the value 1. After some simplification, we arrive at the specification
$\{y \mapsto 1\}\ c\ \{x = 0 \wedge y \mapsto 1\}_e$, which clearly does not hold, since executing $c$ on a
state satisfying the precondition leads to a successful final state mapping $x$ to 1.
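
The counterexample can be replayed on one concrete state. The Python sketch below is an illustration under assumptions, not part of the formal development: the function names are hypothetical, the program `x <- [y]` is hard-coded, and the local store is chosen so that `y` holds a dangling pointer to block 0. It also computes the two side conditions that distinguish the separating conjunction (disjoint live blocks) from a stricter condition that compares all identifiers in the small state against the blocks of the frame.

```python
ERROR = "error"


def blocks(heap):
    """Identifiers of live blocks in a heap."""
    return {block for (block, _) in heap}


def ids(store, heap):
    """All identifiers in a state: live blocks plus pointers stored anywhere."""
    found = blocks(heap)
    for value in list(store.values()) + list(heap.values()):
        if isinstance(value, tuple):
            found.add(value[0])
    return found


def run_read(store, heap):
    """Hard-coded program x <- [y]; an invalid pointer in y yields ERROR."""
    ptr = store["y"]
    if ptr not in heap:
        return ERROR
    return ({**store, "x": heap[ptr]}, dict(heap))


def holds_e(outcome, post):
    """Error-tolerant postcondition check: an error is an acceptable outcome."""
    return outcome == ERROR or post(*outcome)


store = {"x": 0, "y": (0, 0)}   # y holds a dangling pointer to block 0
small_heap = {}                 # satisfies emp
frame_heap = {(0, 0): 1}        # satisfies y |-> 1 together with the store


def x_is_zero(l, m):
    return l["x"] == 0


def x_is_zero_and_y_points_to_one(l, m):
    return l["x"] == 0 and m == {l["y"]: 1}


small_ok = holds_e(run_read(store, small_heap), x_is_zero)
framed_ok = holds_e(run_read(store, {**small_heap, **frame_heap}),
                    x_is_zero_and_y_points_to_one)

separating_ok = blocks(small_heap).isdisjoint(blocks(frame_heap))
isolating_ok = ids(store, small_heap).isdisjoint(blocks(frame_heap))
```

On this instance the small triple holds while the framed one fails, even though the two heaps have disjoint live blocks. The stricter identifier-based condition rejects the split, because the dangling pointer in `y` already names the block of the frame.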

For the frame rule to be recovered, it needs to take errors into account. The 
solution lies in the reachability properties of memory safety: instead of enforcing
locality by preventing errors, we can use the fact that memory operations in a 
safe language are automatically local—in particular, local to the identifiers that 
the program possesses. 


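Theorem 6. Under the same assumptions as Theorem 5, the following rule is
sound

$$\frac{\mathrm{independent}(r, \mathrm{modvars}(c)) \qquad \{p\}\ c\ \{q\}_e}{\{p \triangleright r\}\ c\ \{q \triangleright r\}_e}\ \textsc{SafeFrame}$$

where $p \triangleright r$ denotes the isolating conjunction of $p$ and $r$, defined as

$$\{ (l, m_1 \cup m_2) \mid (l, m_1) \in p,\ (l, m_2) \in r,\ \mathrm{ids}(l, m_1) \# \mathrm{blocks}(l, m_2) \}.$$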


The proof is similar to the one for the original rule, but it relies additionally 
on Theorem 4. This explains why the isolating conjunction is needed, since it 
ensures that the fragment satisfying r is unreachable from the rest of the state. 


3.4 Discussion 


As hinted by their connection with the frame rule, the theorems of Sect. 3.1 are 
a form of local reasoning: to reason about a command, it suffices to consider its
reachable state; how this state is used bears no effect on the unreachable portions.
In a more realistic language, reachability might be inferred from additional
information such as typing. But even here it can probably be accomplished by 
a simple check of the program text. 

For example, consider the hypothetical jpeg decoder from Sect. 1. We would 
like to guarantee that the decoder cannot tamper with an unreachable object—a 
window object, a whitelist of trusted websites, etc. The frame theorems give us 
a means to do so, provided that we are able to show that the object is indeed 
unreachable; additionally, they imply that the jpeg decoder cannot directly 
extract any information from this unreachable object, such as passwords or pri- 
vate keys. 

Many real-world attacks involve direct violations of these reasoning princi- 
ples. For example, consider the infamous Heartbleed attack on OpenSSL, which 
used out-of-bounds reads from a buffer to leak data from completely unrelated 
parts of the program state and to steal sensitive information [16]. Given that 
the code fragment that enabled that attack was just manipulating an innocuous 
array, a programmer could easily be fooled into believing (as probably many 
have) that that snippet could not possibly access sensitive information, allowing 
that vulnerability to remain unnoticed for years. 
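
The failure of local reasoning behind this kind of bug can be seen in a toy model. The sketch below is emphatically not the OpenSSL code: the memory contents, the names, and the layout (a 4-byte payload placed directly before unrelated secret data in one flat byte array) are all assumptions made for illustration. It contrasts an echo routine that trusts a caller-supplied length with one that checks the length against the bounds of the payload.

```python
# Assumed flat memory: a 4-byte payload followed directly by unrelated secret data.
FLAT_MEMORY = bytearray(b"ping" + b"SECRET-KEY")
PAYLOAD_START = 0
PAYLOAD_LEN = 4


def echo_unchecked(claimed_len):
    """Echo back claimed_len bytes, trusting the caller-supplied length."""
    return bytes(FLAT_MEMORY[PAYLOAD_START:PAYLOAD_START + claimed_len])


def echo_checked(claimed_len):
    """Echo back claimed_len bytes, refusing to read outside the payload."""
    if not 0 <= claimed_len <= PAYLOAD_LEN:
        raise IndexError("claimed length exceeds the payload bounds")
    return bytes(FLAT_MEMORY[PAYLOAD_START:PAYLOAD_START + claimed_len])


leaked = echo_unchecked(14)   # reads past the payload into the secret
```

Although `echo_unchecked` mentions only the payload, its result depends on memory it was never meant to reach, which is exactly the reachability-based reasoning that memory safety is supposed to validate.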

Finally, our new frame rule only captures the fact that a command cannot 
influence the heap locations that it cannot reach, while our noninterference result 
(Corollary 1) captures not just this integrity aspect of memory safety, but also a 
secrecy aspect. We hope that future research will explore the connection between 
the secrecy aspect of memory safety and (relational) program logics. 


4 Relaxing Memory Safety 


So much for formalism. What about reality? Strictly speaking, the security prop- 
erties we have identified do not hold of any real system. This is partly due 
to fundamental physical limitations—real systems run with finite memory, and 
interact with users in various ways that transcend inputs and outputs, notably 
through time and other side channels. A more interesting reason is that real 
systems typically do not impose all the restrictions required for the proofs of 
these properties. Languages that aim for safety generally offer relatively benign 
glimpses of their implementation details (such as accessing the contents of uninitialized
memory, extracting physical addresses from pointers, or comparing them for
ordering) in return for significant flexibility or performance gains. In other systems,
the concessions are more fundamental, to the extent that it is harder to
clearly delimit what part of a program is unsafe: the SoftBound transforma- 
tion [31], for example, adds bounds checks for C programs, but does not pro- 
tect against memory-management bugs; a related transformation, CETS [32], is 
required for temporal safety. 


5 Though the attacker model considered in this paper does not try to address such 
side-channel attacks, one should be able to use the previous research on the subject 
to protect against them or limit the damage they can cause [6, 39,40, 49]. 
In this section, we enumerate common relaxed models of memory safety and
evaluate how they affect the reasoning principles and security guarantees of 
Sect.3. Some relaxations, such as allowing pointers to be forged out of thin 
air, completely give up on reachability-based reasoning. Others, however, retain 
strong guarantees for integrity while giving up on some secrecy, allowing aspects 
of the global state of a program to be observed. For example, a system with finite 
memory (Sect. 4.5) may leak some information about its memory consumption, 
and a system that allows pointer-to-integer casts (Sect. 4.2) may leak information 
about its memory layout. Naturally, the distinction between integrity and secrecy 
should be taken with a grain of salt, since the former often depends on the latter; 
for example, if a system grants privileges to access some component when given
the right password, a secrecy violation can escalate to an integrity violation!


4.1 Forging Pointers 


Many real-world C programs use integers as pointers. If this idiom is allowed 
without restrictions, then local reasoning is compromised, as every memory 
region may be reached from anywhere in the program. It is not surprising that 
languages that strive for safety either forbid this kind of pointer forging or con- 
fine it to clear unsafe fragments. 

More insidiously, and perhaps surprisingly, similar dangers lurk in the state- 
ful abstractions of some systems that are widely regarded as “memory safe.” 
JavaScript, for example, allows code to access arbitrary global variables by 
indexing an associative array with a string, a feature that enables many seri- 
ous attacks [1,18,29,44]. One might argue that global variables in JavaScript 
are “memory unsafe” because they fail to validate local reasoning: even if part 
of a JavaScript program does not explicitly mention a given global variable, it 
might still change this variable or the objects it points to. Re-enabling local 
reasoning requires strong restrictions on the programming style [1,9,18]. 


4.2 Observing Pointers 


The language of Sect.2 maintains a complete separation between pointers and 
other values. In reality, this separation is often only enforced in one direction. 
For example, some tools for enforcing memory safety in C [13,31] allow pointer- 
to-integer casts [23] (a feature required by many low-level idioms [10,28]); and 
the default implementation of hashCode() in Java leaks address information. 
To model such features, we can extend the syntax of expressions with a form
$\mathrm{cast}(e)$, the semantics of which are defined with some function $[\mathrm{cast}] : I \times \mathbb{Z} \to \mathbb{Z}$
for converting a pointer to an integer:

$$[\mathrm{cast}(e)](s) = [\mathrm{cast}]([e](s)) \quad \text{if } [e](s) \in I \times \mathbb{Z}$$


Note that the original language included an operator for extracting the offset 
of a pointer. Their definitions are similar, but have crucially different conse- 
quences: while offsets do not depend on the identifier, allocation order, or other 
low-level details of the language implementation (such as the choice of physical
addresses when allocating a block), all of these could be relevant when
defining the semantics of cast. The three frame theorems (1, 3, and 4) are thus 
lost, because the state of unreachable parts of the heap may influence integers 
observed by the program. An important consequence is that secrecy is weakened 
in this language: an attacker could exploit pointers as a side-channel to learn 
secrets about data it shouldn’t access. 
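
A small experiment makes the loss of secrecy concrete. In the Python sketch below, every name is hypothetical and two assumptions are made purely for illustration: fresh identifiers come from a counter-like allocator, and the cast function exposes the identifier as part of the resulting integer. The probing code holds no pointer into the pre-existing heap, yet the integer it observes depends on how many blocks that heap contains.

```python
def cast(ptr):
    """Assumed cast function: expose the block identifier as a page-like address."""
    block, offset = ptr
    return block * 4096 + offset


def fresh_block(heap):
    # Assumption: counter-like allocator, one past the largest live identifier.
    return max((block for (block, _) in heap), default=-1) + 1


def probe(heap):
    """Code with no pointers into `heap`: allocate one cell and cast its pointer."""
    heap = dict(heap)                  # work on a copy; existing cells are untouched
    block = fresh_block(heap)
    heap[(block, 0)] = 0
    return cast((block, 0)), heap


few_secrets = {(0, 0): 42}
many_secrets = {(0, 0): 42, (1, 0): 42, (2, 0): 42}

observed_few, heap_after_few = probe(few_secrets)
observed_many, heap_after_many = probe(many_secrets)
```

The two observations differ, so the probe learns something about unreachable memory, while the unreachable cells themselves are left unchanged, in line with the integrity guarantee discussed next.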

Nevertheless, integrity is not affected: if a block is unreachable, its contents 
will not change at the end of the execution. (This result was also proved in Coq.) 


Theorem 7 (Integrity-only Noninterference). Let $s_1$, $s_2$, and $s'$ be states
and $c$ a command such that $\mathrm{vars}(c) \subseteq \mathrm{vars}(s_1)$, $\mathrm{ids}(s_1) \# \mathrm{blocks}(s_2)$, and
$[c](s_1 \cup s_2) = s'$. Then we can find $s_1' \in S$ such that $s' = s_1' \cup s_2$ and $\mathrm{ids}(s_1') \# \mathrm{blocks}(s_2)$.


The stronger noninterference result of Corollary 1 showed that, if pointer-to-integer
casts are prohibited, changing the contents of the unreachable portion $s_2$
has no effect on the reachable portion, $s'_1$. In contrast, Theorem 7 allows changes
in $s_2$ to influence $s'_1$ in arbitrary ways in the presence of these casts: not only can
the contents of this final state change, but the execution can also loop forever 
or terminate in an error. 

To see why, suppose that the JPEG decoder of Sect. 1 is part of a web browser,
but that it does not have the required pointers to learn the address that the user
is currently visiting. Suppose that there is some relation between the memory
consumption of the program and that website, and that there is some correlation
between the memory consumption and the identifier assigned to a new block.
Then, by allocating a block and converting its pointer to an integer, the decoder
might be able to infer useful information about the visited website [22]. Thus,
if $s_2$ denoted the part of the state where that location is stored, changing its
contents would have a nontrivial effect on $s_1$, the part of the state that the
decoder does have access to. We could speculate that, in a reasonable system, this
channel can only reveal information about the layout of unreachable regions, and 
not their contents. Indeed, we conjecture this for the language of this subsection. 

Finally, it is worth noting that simply excluding casts might not suffice to 
prevent this sort of vulnerability. Recall that our language takes both offsets 
and identifiers into account for equality tests. For performance reasons, we could 
have chosen a different design that only compares physical addresses, completely 
discarding identifiers. If attackers know the address of a pointer in the program
(which could happen, for instance, if they have access to the code of the program
and of the allocator), they can use pointer arithmetic (which is generally harmless
and allowed in our language) to find the address of other pointers. If $x$ holds
the pointer they control, they can run, for instance,


\[ y \leftarrow \mathsf{alloc}(1);\ \mathsf{if}\ x + 1729 = y\ \mathsf{then}\ \ldots\ \mathsf{else}\ \ldots, \]


to learn the location assigned to y and draw conclusions about the global state. 
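
The probing attack can be sketched in Python under strong simplifying assumptions: a hypothetical bump allocator over a flat address space (`make_allocator`), pointers represented as bare integer addresses, and an equality test that compares only those addresses. None of these names come from the paper.

```python
def make_allocator(start):
    """Hypothetical bump allocator over a flat address space."""
    state = {"next": start}

    def alloc(size):
        addr = state["next"]
        state["next"] += size
        return addr

    return alloc


def probe(alloc, x, max_distance):
    """Find d such that x + d equals a fresh address, using only equality tests."""
    y = alloc(1)
    for d in range(max_distance + 1):
        if x + d == y:   # address-only comparison: identifiers are ignored
            return d
    return None
```

If other components allocated 1728 cells between the attacker's pointer `x` and the probe, `probe` returns 1729, revealing how much memory the rest of the system consumed. With the equality test of our language, which also compares block identifiers, every such comparison between distinct blocks would simply be false.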


4.3 Uninitialized Memory


Safe languages typically initialize new variables and objects. But this can degrade
performance, leading to cases where this feature is dropped, including standard
C implementations, safer alternatives [13,31], OCaml’s Bytes.create primitive,
and Node.js’s Buffer.allocUnsafe.

The problem with this concession is that the entire memory becomes relevant 
to execution, and local reasoning becomes much harder. By inspecting old values 
living in uninitialized memory, an attacker can learn about parts of the state 
they shouldn’t access and violate secrecy. This issue would become even more 
severe in a system that allowed old pointers or other capabilities to occur in 
re-allocated memory in a way that the program can use, since they could yield 
access to restricted resources directly, leading to potential integrity violations as 
well. (The two examples given above—OCaml and Node.js—do not suffer from 
this issue, because any preexisting pointers in re-allocated memory are treated 
as bare bytes that cannot be used to access memory.) 
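
The secrecy problem can be illustrated with a small Python sketch. The class `ReusingHeap`, its first-fit free list, and the function `leak_demo` are hypothetical and only meant to mimic an allocator that hands out recycled memory without clearing it.

```python
class ReusingHeap:
    """Toy flat memory with a free list and no zeroing on allocation."""

    def __init__(self, size):
        self.cells = [0] * size
        self.free_list = [(0, size)]   # list of (start, length) free chunks

    def alloc(self, n):
        """First-fit allocation; returns the start index, contents left as-is."""
        for k, (start, length) in enumerate(self.free_list):
            if length >= n:
                if length == n:
                    del self.free_list[k]
                else:
                    self.free_list[k] = (start + n, length - n)
                return start
        raise MemoryError("out of memory")

    def free(self, start, n):
        """Return a chunk to the front of the free list without clearing it."""
        self.free_list.insert(0, (start, n))


def leak_demo():
    heap = ReusingHeap(16)
    victim = heap.alloc(4)
    heap.cells[victim:victim + 4] = [7, 3, 5, 5]   # secret written by the victim
    heap.free(victim, 4)
    attacker = heap.alloc(4)                       # attacker receives the same chunk
    return heap.cells[attacker:attacker + 4]       # stale secret, never initialized
```

Here the attacker never held a pointer to the victim's block while it was live, yet reads its old contents after reallocation. Zeroing the cells in `alloc` (as the language of Sect. 2 does) would close this particular channel.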


4.4 Dangling Pointers and Freshness 


Another crucial issue is the treatment of dangling pointers—references to pre- 
viously freed objects. Dangling pointers are problematic because there is an 
inherent tension between giving them a sensible semantics (for instance, one 
that validates the properties of Sect.3) and obtaining good performance and 
predictability. Languages with garbage collection avoid the issue by forbidding 
dangling pointers altogether—heap storage is freed only when it is unreachable. 
In the language of Sect. 2, besides giving a well-defined behavior to the use of 
dangling pointers (signaling an error), we imposed strong freshness requirements 
on allocation, mandating not only that the new identifier not correspond to any 
existing block, but also that it not be present anywhere else in the state. 

To see how the results of Sect. 3 are affected by weakening freshness, suppose
we run the program $x \leftarrow \mathsf{alloc}(1);\ z \leftarrow (y = x)$ on a state where y holds a
dangling pointer. Depending on the allocator and the state of the memory, the 
pointer assigned to x could be equal to y. Since this outcome depends on the 
entire state of the system, not just the reachable memory, Theorems 1, 3 and 4 
now fail. Furthermore, an attacker with detailed knowledge of the allocator could 
launder secret information by testing pointers for equality. Weakening freshness 
can also have integrity implications, since it becomes harder to ensure that blocks 
are properly isolated. For instance, a newly allocated block might be reachable 
through a dangling pointer controlled by an attacker, allowing them to access 
that block even if they were not supposed to. 
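
A minimal Python sketch of this example follows. The function `run` and its `recycle` parameter (the identifiers a hypothetical allocator is willing to hand out again) are assumptions made for illustration.

```python
def run(recycle, y):
    """Model `x <- alloc(1); z <- (y = x)` where `y` may be a dangling pointer.

    `recycle` lists identifiers the allocator may reuse; when it is empty,
    a strictly fresh identifier is chosen instead.
    """
    used = {y[0]}
    if recycle:
        ident = recycle.pop()   # weak freshness: an old identifier comes back
    else:
        ident = max(used) + 1   # strong freshness: not present anywhere in the state
    x = (ident, 0)
    z = (y == x)
    return z
```

With strong freshness the result is always false, regardless of the allocator's internal state. Once identifiers can be recycled, the value of `z` depends on that internal state, which the program cannot reach.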

Some practical solutions for memory safety use mechanisms similar to our 
language’s, where each memory location is tagged with an identifier describ- 
ing the region it belongs to [11,15]. Pointers are tagged similarly, and when a 
pointer is used to access memory, a violation is detected if its identifier does not 
match the location’s. However, for performance reasons, the number of possible 
identifiers might be limited to a relatively small number, such as 2 or 4 [11] or
16 [46]. In addition to the problems above, since multiple live regions can share
the same identifier in such schemes, it might be possible for buffer overflows to 
lead to violations of secrecy and integrity as well. 
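
The following Python sketch illustrates why a small tag space weakens the check. The contiguous layout and the round-robin tag assignment in `tagged_memory` are assumptions of this sketch and do not describe any particular system from the literature.

```python
def tagged_memory(region_sizes, num_tags):
    """Lay out regions contiguously, assigning tags round-robin from `num_tags`."""
    tags = []
    for k, size in enumerate(region_sizes):
        tags.extend([k % num_tags] * size)
    return tags


def access_allowed(tags, pointer_tag, address):
    """Tag check performed on every access: pointer tag must match location tag."""
    return 0 <= address < len(tags) and tags[address] == pointer_tag
```

With three regions of four cells and only two tags, a pointer to the first region (tag 0) that overflows to address 8 lands in the third region, which also carries tag 0, so the access passes the check. With 16 tags the same access is rejected.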

Although we framed our discussion in terms of identifiers, the issue of fresh- 
ness can manifest itself in other ways. For example, many systems for spatial 
safety work by adding base and bounds information to pointers. In some of 
these [13,31], dangling pointers are treated as an orthogonal issue, and it is pos- 
sible for the allocator to return a new memory region that overlaps with the 
range of a dangling pointer, in which case the new region will not be properly 
isolated from the rest of the state. 

Finally, dangling pointers can have disastrous consequences for overall system 
security, independently of the freshness issues just described: freeing a pointer 
more than once can break allocator invariants, enabling attacks [43]. 


4.5 Infinite Memory 


Our idealized language allows memory to grow indefinitely. But real languages 
run on finite memory, and allocation fails when programs run out of space. 
Besides enabling denial-of-service attacks, finite memory has consequences for 
secrecy. Corollary 1 does not hold in a real programming language as is, because 
an increase in memory consumption can cause a previously successful allocation 
to fail. By noticing this difference, a piece of code might learn something about 
the entire state of the program. How problematic this is in practice will depend 
on the particular system under consideration. 

A potential solution is to force programs that run out of memory to terminate 
immediately. Though this choice might be bad from an availability standpoint, 
it is probably the most benign in terms of secrecy. We should be able to prove 
an error-insensitive variant of Corollary 1, where the only significant effect that 
unreachable memory can have is to turn a successful execution or infinite loop 
into an error. Similar issues arise for IFC mechanisms that often cannot prevent 
secrets from influencing program termination, leading to termination-insensitive 
notions of noninterference. 

Unfortunately, even an error-insensitive result might be too strong for real 
systems, which often make it possible for attackers to extract multiple bits of 
information about the global state of the program—as previously noted in the 
IFC literature [4]. Java, for example, does not force termination when memory 
runs out, but triggers an exception that can be caught and handled by user code, 
which is then free to record the event and probe the allocator with a different 
test. And most languages do not operate in batch mode like ours does, merely 
producing a single answer at the end of execution; rather, their programs con- 
tinuously interact with their environment through inputs and outputs, allowing 
them to communicate the exact amount of memory that caused an error. 
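
As a rough illustration of how catchable allocation failures leak more than one bit, consider the Python sketch below. `BoundedAllocator` and `measure_free_space` are hypothetical names; the allocator simply counts cells against a fixed capacity shared by all components.

```python
class BoundedAllocator:
    """Hypothetical allocator with a fixed capacity shared by all components."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0

    def alloc(self, n):
        if self.used + n > self.capacity:
            raise MemoryError("out of memory")   # catchable, as in Java
        self.used += n

    def free(self, n):
        self.used -= n


def measure_free_space(allocator):
    """Binary-search the largest allocation that succeeds, freeing after each probe."""
    lo, hi = 0, allocator.capacity
    while lo < hi:
        mid = (lo + hi + 1) // 2
        try:
            allocator.alloc(mid)
            allocator.free(mid)
            lo = mid
        except MemoryError:
            hi = mid - 1
    return lo
```

A component that can catch the failure and retry learns the exact amount of memory used by the rest of the system in a logarithmic number of probes, rather than the single bit revealed by one forced termination.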

This discussion suggests that, if size vulnerabilities are a real concern, they 
need to be treated with special care. One approach would be to limit the amount 
of memory an untrusted component can allocate [47], so that exhausting the 
memory allotted to that component doesn’t reveal information about the state
of the rest of the system (and so that global denial-of-service attacks are also
prevented). A more speculative idea is to develop quantitative versions [6,39] of
the noninterference results discussed here that apply only if the total memory 
used by the program is below a certain limit. 


5 Case Study: A Memory-Safety Monitor 


To demonstrate the applicability of our characterization, we use it to analyze a 
tag-based monitor proposed by Dhawan et al. to enforce heap safety for low-level 
code [15]. In prior work [5], we and others showed that an idealized model of 
the monitor correctly implements a higher-level abstract machine with built-in 
memory safety—a bit more formally, every behavior of the monitor is also a 
behavior of the abstract machine. Building upon this work, we prove that this 
abstract machine satisfies a noninterference property similar to Corollary 1. We 
were also able to prove that a similar result holds for a lower-level machine that 
runs a so-called “symbolic” representation of the monitor—although we had to 
slightly weaken the result to account for memory exhaustion (cf. Sect. 4.5), since 
the machine that runs the monitor has finite memory, while the abstract machine 
has infinite memory. If we had a verified machine-code implementation of this 
monitor, it would be possible to prove a similar result for it as well. 


5.1 Tag-Based Monitor 


We content ourselves with a brief overview of Dhawan et al.’s monitor [5,15],
since the formal statement of the reasoning principles it supports is more complex
than the one for the abstract machine from Sect. 5.2, on which we will focus.
Following a proposal by Clause et al. [11], Dhawan et al.’s monitor enforces mem- 
ory safety for heap-allocated data by checking and propagating metadata tags. 
Every memory location receives a tag that uniquely identifies the allocated region 
to which that location belongs (akin to the identifiers in Sect. 2), and pointers 
receive the tag of the region they are allowed to reference. The monitor assigns 
these tags to new regions by storing a monotonic counter in protected memory 
that is bumped on every call to malloc; with a large number of possible tags, it 
is possible to avoid the freshness pitfalls discussed in Sect. 4.4. When a memory 
access occurs, the monitor checks whether the tag on the pointer matches the tag 
on the location. If they do, the operation is allowed; otherwise, execution halts. 
The monitor instruments the allocator to set up tags correctly. Its implementation
achieves good performance using the PUMP, a hardware extension
accelerating such micro-policies for metadata tagging [15]. 
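
The checking discipline described above can be sketched in Python as follows. This is a toy model under stated assumptions, not the monitor of Dhawan et al.: the class `TagMonitor`, the bump allocation without reuse, the explicit size argument to `free`, and the use of `RuntimeError` to represent a halt are all choices made for this illustration.

```python
class TagMonitor:
    """Toy tag-based heap monitor over a flat memory (bump allocation, no reuse)."""

    def __init__(self, size):
        self.values = [0] * size
        self.tags = [None] * size     # None marks unallocated locations
        self.counter = 0              # monotonic counter kept in "protected" state
        self.next_free = 0

    def malloc(self, n):
        """Tag `n` fresh locations and return a tagged pointer (address, tag)."""
        if self.next_free + n > len(self.values):
            raise MemoryError("out of memory")
        tag = self.counter
        self.counter += 1             # bumped on every call, so tags are never reused
        base = self.next_free
        self.next_free += n
        for a in range(base, base + n):
            self.tags[a] = tag
        return (base, tag)

    def _check(self, addr, tag):
        if not (0 <= addr < len(self.tags)) or self.tags[addr] != tag:
            raise RuntimeError("memory-safety violation: execution halts")

    def free(self, ptr, n):
        """Clear the tags of an `n`-cell region; later uses of `ptr` will halt."""
        base, tag = ptr
        self._check(base, tag)
        for a in range(base, base + n):
            self.tags[a] = None

    def load(self, ptr, offset=0):
        addr, tag = ptr
        self._check(addr + offset, tag)
        return self.values[addr + offset]

    def store(self, ptr, value, offset=0):
        addr, tag = ptr
        self._check(addr + offset, tag)
        self.values[addr + offset] = value
```

In this model an out-of-bounds access fails because the neighbouring location carries a different tag, and a use after `free` fails because the location's tag has been cleared.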


5.2 Abstract Machine 


The memory-safe abstract machine [5] operates on two kinds of values: machine
words $w$, or pointers $(i, w)$, which are pairs of an identifier $i \in I$ and an offset
$w$. We use $W$ to denote the set of machine words, and $V$ to denote the set
of values. Machine states are triples $(m, rs, pc)$, where (1) $m \in I \rightharpoonup_{\mathrm{fin}} V^*$ is a
memory mapping identifiers to lists of values; (2) $rs \in R \rightharpoonup_{\mathrm{fin}} V$ is a register
bank, mapping register names to values; and (3) $pc \in V$ is the program counter.

The execution of an instruction is specified by a step relation $s \to s'$. If there
is no $s'$ such that $s \to s'$, we say that $s$ is stuck, which means that a fatal
error occurred during execution. On each instruction, the machine checks if the 
current program counter is a pointer and, if so, tries to fetch the corresponding 
value in memory. The machine then ensures that this value is a word that cor- 
rectly encodes an instruction and, if so, acts accordingly. The instructions of the 
machine, representative of typical RISC architectures, allow programs to perform 
binary and logical operations, move values to and from memory, and branch. The 
machine is in fact fairly similar to the language of Sect.2. Some operations are 
overloaded to manipulate pointers; for example, adding a pointer to a word is 
allowed, and the result is obtained by adjusting the pointer’s offset accordingly. 
Accessing memory causes the machine to halt when the corresponding position 
is undefined. 

In addition to these basic instructions, the machine possesses a set of special 
monitor services that can be invoked as regular functions, using registers to 
pass in arguments and return values. There are two services alloc and free for 
managing memory, and one service eq for testing whether two values are equal. 
The reason for using separate monitor services instead of special instructions 
is to keep its semantics closer to the more concrete machine that implements 
it. While instructions include an equality test, it cannot replace the eq service, 
since it only takes physical addresses into account. As argued in Sect. 4.2, such 
comparisons can be turned into a side channel. To prevent this, testing two 
pointers for equality directly using the corresponding machine instruction results 
in an error if the pointers have different block identifiers. 


5.3 Verifying Memory Safety 


The proof of memory safety for this abstract machine mimics the one carried out for
the language in Sect. 3. We use similar notations as before: $\pi \cdot s$ means renaming
every identifier that appears in $s$ according to the permutation $\pi$, and ids(s) is
the finite set of all identifiers that appear in the state s. A simple case analysis on
the possible instructions yields analogs of Theorems 1, 2 and 4 (we don’t include
an analog of Theorem 3 because we consider individual execution steps, where
loops cannot occur).


Theorem 8. Let $\pi$ be a permutation, and $s$ and $s'$ be two machine states such
that $s \to s'$. There exists another permutation $\pi'$ such that $\pi \cdot s \to \pi' \cdot s'$.


Theorem 9. Let $(m_1, rs, pc)$ be a state of the abstract machine, and $m_2$ a
memory. Suppose that $\mathsf{ids}(m_1, rs, pc) \mathrel{\#} \mathsf{dom}(m_2)$, and that $(m_1, rs, pc) \to
(m', rs', pc')$. Then, there exists a permutation $\pi$ such that $\mathsf{ids}(\pi \cdot m', \pi \cdot rs', \pi \cdot pc') \mathrel{\#}
\mathsf{dom}(m_2)$ and $(m_2 \cup m_1, rs, pc) \to (m_2 \cup \pi \cdot m', \pi \cdot rs', \pi \cdot pc')$.


Theorem 10. Let $(m_1, rs, pc)$ be a machine state, and $m_2$ a memory. If
$\mathsf{ids}(m_1, rs, pc) \mathrel{\#} \mathsf{dom}(m_2)$, and $(m_1, rs, pc)$ is stuck, then $(m_2 \cup m_1, rs, pc)$ is
also stuck.


Once again, we can combine these properties to obtain a proof of noninter- 
ference. Our Coq development includes a complete statement. 
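
The renaming operation and the ids function used in these statements can be sketched in Python. The representation below (memories as dictionaries from identifiers to lists, pointers as pairs, permutations as dictionaries that are the identity outside their keys) is an assumption of this sketch and is unrelated to the Coq development.

```python
def rename_value(pi, v):
    """Apply an identifier renaming `pi` (a dict) to a word or a pointer."""
    if isinstance(v, tuple):
        ident, offset = v
        return (pi.get(ident, ident), offset)
    return v


def rename_state(pi, state):
    """Rename every identifier in a machine state (memory, registers, pc).

    `pi` must be a bijection on identifiers; otherwise blocks would collide.
    """
    mem, regs, pc = state
    new_mem = {pi.get(i, i): [rename_value(pi, v) for v in block]
               for i, block in mem.items()}
    new_regs = {r: rename_value(pi, v) for r, v in regs.items()}
    return (new_mem, new_regs, rename_value(pi, pc))


def ids(state):
    """All identifiers appearing anywhere in the state."""
    mem, regs, pc = state
    found = set(mem)
    values = [v for block in mem.values() for v in block]
    values += list(regs.values()) + [pc]
    found |= {v[0] for v in values if isinstance(v, tuple)}
    return found
```

The side condition of Theorems 9 and 10 then reads `ids(state).isdisjoint(m2)` for an extra memory `m2`, and Theorem 8 says that stepping commutes with `rename_state` up to a possibly different renaming of the result.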


5.4 Discussion 


The reasoning principles supported by the memory-safety monitor have an 
important difference compared to the ones of Sect. 3. In the memory-safe lan- 
guage, reachability is relative to a program’s local variables. If we want to argue 
that part of the state is isolated from some code fragment, we just have to con- 
sider that fragment’s local variables—other parts of the program are still allowed 
to access the region. The memory-safety monitor, on the other hand, does not 
have an analogous notion: an unreachable memory region is useless, since it 
remains unreachable by all components forever. 

It seems that, from the standpoint of noninterference, heap memory safety 
taken in isolation is much weaker than the guarantees it provides in the presence 
of other language features, such as local variables. Nevertheless, the properties 
studied above suggest several avenues for strengthening the mechanism and mak- 
ing its guarantees more useful. The most obvious one would be to use the mech- 
anism as the target of a compiler for a programming language that provides 
other (safe) stateful abstractions, such as variables and a stack for procedure 
calls. A more modest approach would be to add other state abstractions to the 
mechanism itself. Besides variables and call stacks, if the mechanism made code 
immutable and separate from data, a simple check would suffice to tell whether 
a code segment stored in memory references a given privileged register. If the 
register is the only means of reaching a memory region, we should be able to 
soundly infer that that code segment is independent of that region. 

On a last note, although the abstract machine we verified is fairly close to our 
original language, the dynamic monitor that implements it using tags is quite 
different (Sect.5.1). In particular, the monitor works on a machine that has a 
flat memory model, and keeps track of free and allocated memory using a pro- 
tected data structure that stores block metadata. It was claimed that reasoning 
about this base and bounds information was the most challenging part of the 
proof that the monitor implements the abstract machine [5]. For this reason, we 
believe that this proof can be adapted to other enforcement mechanisms that 
rely solely on base and bounds information—for example, fat pointers [13,25] or 
SoftBound [31]—while keeping a similar abstract machine as their specification, 
and thus satisfying a similar noninterference property. This gives us confidence 
that our memory safety characterization generalizes to other settings. 


6 Related Work 


The present work lies at the intersection of two areas of previous research: one 
on formal characterizations of memory safety, the other on reasoning principles 
for programs. We review the most closely related work in these areas.

Characterizing Memory Safety. Many formal characterizations of memory safety
originated in attempts to reconcile its benefits with low-level code. Gener- 
ally, these works claim that a mechanism is safe by showing that it prevents 
or catches typical temporal and spatial violations. Examples in the literature 
include: Cyclone [41], a language with a type system for safe manual memory 
management; CCured [33], a program transformation that adds temporal safety 
to C by refining its pointer type with various degrees of safety; Ivory [17], an
embedding of a similar “safe-C variant” into Haskell; SoftBound [31], an instru- 
mentation technique for C programs for spatial safety, including the detection of 
bounds violations within an object; CETS [32], a compiler pass for preventing 
temporal safety violations in C programs, including accessing dangling point- 
ers into freed heap regions and stale stack frames; the memory-safety monitor 
for the PUMP [5,15], which formed the basis of our case study in Sect. 5; and 
languages like Mezzo [35] and Rust [45], whose guarantees extend to prevent- 
ing data races [7]. Similar models appear in formalizations of C [24,26], which 
need to rigorously characterize its sources of undefined behavior—in particular, 
instances of memory misuse. 

Either explicitly or implicitly, these works define memory errors as attempts 
to use a pointer to access a location that it was not meant to access—for exam- 
ple, an out-of-bounds or free one. This was noted by Hicks [20], who, inspired by 
SoftBound, proposed to define memory safety as an execution model that tracks 
what part of memory each pointer can access. Our characterization is comple- 
mentary to these accounts, in that it is extensional: its data isolation properties 
allow us to reason directly about the observable behavior of the program. Furthermore,
as demonstrated by our application to the monitor of Sect. 5 and the
discussions in Sect. 4, it can be adapted to various enforcement mechanisms and
variations of memory safety.


Reasoning Principles. Separation logic [36,48] has been an important source of 
inspiration for our work. The logic’s frame rule enables its local reasoning capa- 
bilities and imposes restrictions that are similar to those mandated by memory- 
safe programming guidelines. As discussed in Sect. 3.3, our reasoning principles 
are reminiscent of the frame rule, but use reachability to guarantee locality in 
settings where memory safety is enforced automatically. In separation logic, by 
contrast, locality needs to be guaranteed for each program individually by com- 
prehensive proofs. 

Several works have investigated similar reasoning principles for a variety of 
program analyses, including static, dynamic, manual, or a mixture of those. Some 
of these are formulated as expressive logical relations, guaranteeing that pro- 
grams are compatible with the framing of state invariants; representative works 
include: L³ [3], a linear calculus featuring strong updates and aliasing control;
the work of Benton and Tabareau [8] on a compiler for a higher-order language;
and the work of Devriese et al. [14] on object capabilities for a JavaScript-like
language. Other developments are based on proof systems reminiscent of separation
logic; these include Yarra [38], an extension of C that allows programmers
to protect the integrity of data structures marked as critical; the work
of Agten et al. [2], which allows mixing unverified and verified components by
instrumenting the program to check that required assertions hold at interfaces;
and the logic of Swasey et al. [42] for reasoning about object capabilities.

Unlike our work, these developments do not propose reachability-based isola- 
tion as a general definition of memory safety, nor do they attempt to analyze how 
their reasoning principles are affected by common variants of memory safety. Fur- 
thermore, many of these other works—especially the logical relations—rely on 
encapsulation mechanisms such as closures, objects, or modules that go beyond 
plain memory safety. Memory safety alone can only provide complete isolation, 
while encapsulation provides finer control, allowing some interaction between 
components, while guaranteeing the preservation of certain state invariants. In 
this sense, one can see memory-safety reasoning as a special case of encapsulation 
reasoning. Nevertheless, it is a practically relevant special case that is interesting 
on its own, since when reasoning about an encapsulated component, one must 
argue explicitly that the invariants of interest are preserved by the private oper- 
ations of that component; memory safety, on the other hand, guarantees that 
any invariant on unreachable parts of the memory is automatically preserved. 

Perhaps closer to our work, Maffeis et al. [27] show that their notion of 
“authority safety” guarantees isolation, in the sense that a component’s actions 
cannot influence the actions of another component with disjoint authority. Their 
notion of authority behaves similarly to the set of block identifiers accessible by 
a program in our language; however, they do not attempt to connect their notion 
of isolation to the frame rule, noninterference, or traditional notions of memory 
safety. 

Morrisett et al. [30] state a correctness criterion for garbage collection based 
on program equivalence. Some of the properties they study are similar to the 
frame rule, describing the behavior of code running in an extended heap. How- 
ever, they use this analysis to justify the validity of deallocating objects, rather 
than studying the possible interactions between the extra state and the program 
in terms of integrity and secrecy. 


7 Conclusions and Future Work 


We have explored the consequences of memory safety for reasoning about pro- 
grams, formalizing intuitive principles that, we argue, capture the essential dis- 
tinction between memory-safe systems and memory-unsafe ones. We showed how 
the reasoning principles we identified apply to a recent dynamic monitor for heap 
memory safety. 

The systems studied in this paper have a simple storage model: the lan- 
guage of Sect.2 has just global variables and flat, heap-allocated arrays, while 
the monitor of Sect.5 doesn’t even have variables or immutable code. Realis- 
tic programming platforms, of course, offer much richer stateful abstractions, 
including, for example, procedures with stack-allocated local variables as well 
as structured objects with contiguously allocated sub-objects. In terms of mem- 
ory safety, these systems have a richer vocabulary for describing resources that 
programs can access, and programmers could benefit from isolation-based local 
reasoning involving these resources.

For example, in typical safe languages with procedures, the behavior of a
procedure should depend only on its arguments, the global variables it uses, 
and the portions of the state that are reachable from these values; if the caller 
of that procedure has a private object that is not passed as an argument, it 
should not affect or be affected by the call. Additionally, languages such as C 
allow for objects consisting of contiguously allocated sub-objects for improved 
performance. Some systems for spatial safety [13,31] allow capability downgrad- 
ing—that is, narrowing the range of a pointer so that it can’t access outside of 
a sub-object’s bounds. It would be interesting to refine our model to take these 
features into account. In the case of the monitor of Sect.5, such considerations 
could lead to improved designs or to the integration of the monitor inside a 
secure compiler. Conversely, it would be interesting to derive finer security prop- 
erties for relaxations like the ones discussed in Sect.4. Some inspiration could 
come from the IFC literature, where quantitative noninterference results pro- 
vide bounds on the probability that some secret is leaked, the rate at which it 
is leaked, how many bits are leaked, etc. [6,39]. 

The main goal of this work was to understand, formally, the benefits of 
memory safety for informal and partial reasoning, and to evaluate a variety of 
weakened forms of memory safety in terms of which reasoning principles they 
preserve. However, our approach may also suggest ways to improve program 
verification. One promising idea is to leverage the guarantees of memory safety 
to obtain proofs of program correctness modulo unverified code that could have 
errors, in contexts where complete verification is too expensive or not possible 
(e.g., for programs with a plugin mechanism).


Appendix


This appendix defines the language of Sect. 2 more formally. Figure 3 summarizes 
the syntax of programs and repeats the definition of program states. The syntax 
is standard for a simple imperative language with pointers. 

Figure 4 defines expression evaluation, $[\![e]\!] : S \to V$. Variables are looked
up in the local-variable part of the state (for simplicity, heap cells cannot be
dereferenced in expressions; the command $x \leftarrow [e]$ puts the value of a heap
cell in a local variable). Constants (booleans, numbers, and the special value 
nil used to simplify error propagation) evaluate to themselves. Addition and 
subtraction can be applied both to numbers and to combinations of numbers 
and pointers (for pointer arithmetic); multiplication only works on numbers. 
Equality is allowed both on pointers and on numbers. Pointer equality compares 
both the block identifier and its offset, and while this is harder to implement in 
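practice than just comparing physical addresses, this is needed to avoid leaking

\[
\begin{array}{rcll}
\oplus & ::= & + \mid \times \mid - \mid = \mid \le \mid \mathsf{and} \mid \mathsf{or} & \text{(operators)} \\
e & ::= & x \in \mathsf{var} \mid b \in \mathbb{B} \mid n \in \mathbb{Z} & \text{(expressions)} \\
  & \mid & e_1 \oplus e_2 \mid \mathsf{not}\ e \mid \mathsf{offset}\ e \mid \mathsf{nil} & \\
c & ::= & \mathsf{skip} \mid c_1; c_2 & \text{(commands)} \\
  & \mid & \mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2 & \\
  & \mid & \mathsf{while}\ e\ \mathsf{do}\ c\ \mathsf{end} & \\
  & \mid & x \leftarrow e \mid x \leftarrow [e] \mid [e_1] \leftarrow e_2 & \\
  & \mid & x \leftarrow \mathsf{alloc}(e) \mid \mathsf{free}(e) & \\[1ex]
s \in S & \triangleq & L \times M & \text{(states)} \\
l \in L & \triangleq & \mathsf{var} \rightharpoonup_{\mathrm{fin}} V & \text{(local stores)} \\
m \in M & \triangleq & I \times \mathbb{Z} \rightharpoonup_{\mathrm{fin}} V & \text{(heaps)} \\
v \in V & \triangleq & \mathbb{Z} \uplus \mathbb{B} \uplus \{\mathsf{nil}\} \uplus I \times \mathbb{Z} & \text{(values)} \\
O & \triangleq & S \uplus \{\mathsf{error}\} & \text{(outcomes)} \\[1ex]
I & \triangleq & \text{some countably infinite set} & \\
X \rightharpoonup_{\mathrm{fin}} Y & \triangleq & \text{partial functions } X \rightharpoonup Y \text{ with finite domain} &
\end{array}
\]

Fig. 3. Syntax and program states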


information about pointers (see Sect. 4.2). The special expression offset extracts 
the offset component of a pointer; we introduce it to illustrate that for satisfying 
our memory characterization pointer offsets do not need to be hidden (as opposed 
to block identifiers). The less-than-or-equal operator only applies to numbers—in 
particular, pointers cannot be compared. However, since we can extract pointer 
offsets, we can compare those instead. 
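
The evaluation rules described above can be sketched as a small Python interpreter. The encoding is an assumption of this sketch: expressions are nested tuples such as `("+", e1, e2)`, pointers are pairs `(identifier, offset)`, the string `"nil"` stands for the special value nil, and an unbound variable is taken to evaluate to nil.

```python
NIL = "nil"   # stand-in for the special value nil


def is_num(v):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(v, int) and not isinstance(v, bool)


def is_ptr(v):
    return isinstance(v, tuple)


def eval_expr(e, local):
    """Evaluate expression `e` (nested tuples) in the local store `local`."""
    kind = e[0]
    if kind == "var":
        return local.get(e[1], NIL)
    if kind == "const":
        return e[1]
    if kind == "not":
        b = eval_expr(e[1], local)
        return (not b) if isinstance(b, bool) else NIL
    if kind == "offset":
        p = eval_expr(e[1], local)
        return p[1] if is_ptr(p) else NIL
    # remaining forms are binary operators: (op, e1, e2)
    v1, v2 = eval_expr(e[1], local), eval_expr(e[2], local)
    if kind == "=":
        # compare identifier and offset for pointers; keep booleans and numbers apart
        return type(v1) is type(v2) and v1 == v2
    if kind in ("+", "-"):
        sign = 1 if kind == "+" else -1
        if is_num(v1) and is_num(v2):
            return v1 + sign * v2
        if is_ptr(v1) and is_num(v2):
            return (v1[0], v1[1] + sign * v2)   # pointer arithmetic adjusts the offset
        return NIL
    if kind == "*":
        return v1 * v2 if is_num(v1) and is_num(v2) else NIL
    if kind == "<=":
        return v1 <= v2 if is_num(v1) and is_num(v2) else NIL
    if kind in ("and", "or"):
        if isinstance(v1, bool) and isinstance(v2, bool):
            return (v1 and v2) if kind == "and" else (v1 or v2)
        return NIL
    raise ValueError(f"unknown expression form: {kind}")
```

Note that `eval_expr` never inspects the heap and never exposes a block identifier as a number: offsets can be extracted and compared, but identifiers can only be tested for equality.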

The definition of command evaluation employs an auxiliary partial function 
that computes the result of evaluating a program along with the set of block 
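identifiers that were allocated during evaluation. Formally, $[\![c]\!]_+ : S \rightharpoonup O_+$,
where $O_+$ is an extended set of outcomes defined as $\mathcal{P}_{\mathrm{fin}}(I) \times S \uplus \{\mathsf{error}\}$. We
then set

\[
[\![c]\!](l, m) \triangleq
\begin{cases}
(l', m') & \text{if } [\![c]\!]_+(l, m) = (I, l', m') \\
\mathsf{error} & \text{if } [\![c]\!]_+(l, m) = \mathsf{error} \\
\bot & \text{if } [\![c]\!]_+(l, m) = \bot
\end{cases}
\]

\[
\mathsf{finalids}(l, m) \triangleq
\begin{cases}
\mathsf{ids}(l', m') \setminus I & \text{if } [\![c]\!]_+(l, m) = (I, l', m') \\
\emptyset & \text{otherwise}
\end{cases}
\]

To define $[\![c]\!]_+$, we first endow the set $S \rightharpoonup O_+$ with the partial order of
program approximation:

\[ f \sqsubseteq g \triangleq \forall s,\ f(s) \neq \bot \Rightarrow f(s) = g(s) \]

\[
\begin{aligned}
[\![x]\!](l, m) &\triangleq
\begin{cases}
l(x) & \text{if } x \in \mathsf{dom}(l) \\
\mathsf{nil} & \text{otherwise}
\end{cases} \\
[\![b]\!](s) &\triangleq b \qquad [\![n]\!](s) \triangleq n \qquad [\![\mathsf{nil}]\!](s) \triangleq \mathsf{nil} \\
[\![e_1 + e_2]\!](s) &\triangleq
\begin{cases}
n_1 + n_2 & \text{if } [\![e_1]\!](s) = n_1 \text{ and } [\![e_2]\!](s) = n_2 \\
(i, n_1 + n_2) & \text{if } [\![e_1]\!](s) = (i, n_1) \text{ and } [\![e_2]\!](s) = n_2 \\
\mathsf{nil} & \text{otherwise}
\end{cases} \\
[\![e_1 - e_2]\!](s) &\triangleq
\begin{cases}
n_1 - n_2 & \text{if } [\![e_1]\!](s) = n_1 \text{ and } [\![e_2]\!](s) = n_2 \\
(i, n_1 - n_2) & \text{if } [\![e_1]\!](s) = (i, n_1) \text{ and } [\![e_2]\!](s) = n_2 \\
\mathsf{nil} & \text{otherwise}
\end{cases} \\
[\![e_1 \times e_2]\!](s) &\triangleq
\begin{cases}
n_1 \times n_2 & \text{if } [\![e_1]\!](s) = n_1 \text{ and } [\![e_2]\!](s) = n_2 \\
\mathsf{nil} & \text{otherwise}
\end{cases} \\
[\![e_1 = e_2]\!](s) &\triangleq ([\![e_1]\!](s) = [\![e_2]\!](s)) \\
[\![e_1 \le e_2]\!](s) &\triangleq
\begin{cases}
n_1 \le n_2 & \text{if } [\![e_1]\!](s) = n_1 \text{ and } [\![e_2]\!](s) = n_2 \\
\mathsf{nil} & \text{otherwise}
\end{cases} \\
[\![e_1\ \mathsf{and}\ e_2]\!](s) &\triangleq
\begin{cases}
b_1 \wedge b_2 & \text{if } [\![e_1]\!](s) = b_1 \text{ and } [\![e_2]\!](s) = b_2 \\
\mathsf{nil} & \text{otherwise}
\end{cases} \\
[\![e_1\ \mathsf{or}\ e_2]\!](s) &\triangleq
\begin{cases}
b_1 \vee b_2 & \text{if } [\![e_1]\!](s) = b_1 \text{ and } [\![e_2]\!](s) = b_2 \\
\mathsf{nil} & \text{otherwise}
\end{cases} \\
[\![\mathsf{not}\ e]\!](s) &\triangleq
\begin{cases}
\neg b & \text{if } [\![e]\!](s) = b \\
\mathsf{nil} & \text{otherwise}
\end{cases} \\
[\![\mathsf{offset}\ e]\!](s) &\triangleq
\begin{cases}
n & \text{if } [\![e]\!](s) = (i, n) \\
\mathsf{nil} & \text{otherwise}
\end{cases}
\end{aligned}
\]

Fig. 4. Expression evaluation
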
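\[
\begin{aligned}
\mathsf{bind}(f, \bot) &\triangleq \bot \\
\mathsf{bind}(f, \mathsf{error}) &\triangleq \mathsf{error} \\
\mathsf{bind}(f, (I, l, m)) &\triangleq
\begin{cases}
(I \cup I', l', m') & \text{if } f(l, m) = (I', l', m') \\
\mathsf{error} & \text{if } f(l, m) = \mathsf{error} \\
\bot & \text{otherwise}
\end{cases} \\
\mathsf{if}(b, x, y) &\triangleq
\begin{cases}
x & \text{if } b = \mathsf{true} \\
y & \text{if } b = \mathsf{false} \\
\mathsf{error} & \text{otherwise}
\end{cases}
\end{aligned}
\]
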

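Fig. 5. Auxiliary operators bind and if


\[
\begin{aligned}
[\![\mathsf{skip}]\!]_+(l, m) &\triangleq (\emptyset, l, m) \\
[\![c_1; c_2]\!]_+(l, m) &\triangleq \mathsf{bind}([\![c_2]\!]_+, [\![c_1]\!]_+(l, m)) \\
[\![\mathsf{if}\ e\ \mathsf{then}\ c_1\ \mathsf{else}\ c_2]\!]_+(l, m) &\triangleq \mathsf{if}([\![e]\!](l, m), [\![c_1]\!]_+(l, m), [\![c_2]\!]_+(l, m)) \\
[\![\mathsf{while}\ e\ \mathsf{do}\ c\ \mathsf{end}]\!]_+ &\triangleq \mathsf{fix}(\lambda f\, (l, m).\ \mathsf{if}([\![e]\!](l, m), \mathsf{bind}(f, [\![c]\!]_+(l, m)), (\emptyset, l, m))) \\
[\![x \leftarrow e]\!]_+(l, m) &\triangleq (\emptyset, l[x \mapsto [\![e]\!](l, m)], m) \\
[\![x \leftarrow [e]]\!]_+(l, m) &\triangleq
\begin{cases}
(\emptyset, l[x \mapsto m(i, n)], m) & \text{if } [\![e]\!](l, m) = (i, n) \text{ and } m(i, n) \neq \bot \\
\mathsf{error} & \text{otherwise}
\end{cases} \\
[\![[e_1] \leftarrow e_2]\!]_+(l, m) &\triangleq
\begin{cases}
(\emptyset, l, m[(i, n) \mapsto [\![e_2]\!](l, m)]) & \text{if } [\![e_1]\!](l, m) = (i, n) \text{ and } m(i, n) \neq \bot \\
\mathsf{error} & \text{otherwise}
\end{cases} \\
[\![x \leftarrow \mathsf{alloc}(e)]\!]_+(l, m) &\triangleq
\begin{cases}
(\{i\}, l[x \mapsto (i, 0)], m[(i, k) \mapsto 0 \mid 0 \le k < n]) & \text{if } [\![e]\!](l, m) = n \text{ and } i = \mathsf{fresh}(\mathsf{ids}(l, m)) \\
\mathsf{error} & \text{otherwise}
\end{cases} \\
[\![\mathsf{free}(e)]\!]_+(l, m) &\triangleq
\begin{cases}
(\emptyset, l, m[(i, k) \mapsto \bot \mid k \in \mathbb{Z}]) & \text{if } [\![e]\!](l, m) = (i, 0) \text{ and } m(i, n) \neq \bot \text{ for some } n \\
\mathsf{error} & \text{otherwise}
\end{cases}
\end{aligned}
\]


Fig. 6. Command evaluation with explicit allocation sets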


This allows us to define the semantics of iteration (the rule for while e do c end) 
in a standard way using the Kleene fixed point operator fix. 

The definition of $[\![c]\!]_+$ appears in Fig. 6, where several of the rules use a
bind operator (Fig. 5) to manage the “plumbing” of the sets of allocated block
ids between the evaluation of one subcommand and the next. The rules for if
and while also use an auxiliary operator if (also defined in Fig. 5) that turns
non-boolean guards into errors. 
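
As a rough illustration of the two operators in Fig. 5, the following Python sketch encodes an outcome either as a triple (I, l, m) or as one of two sentinel values. The names BOTTOM, ERROR, and if_, and the choice of Python sets and dicts for I, l, and m, are assumptions made for this example only; they are not part of the formal development.

```python
# Sketch of the auxiliary operators of Fig. 5.
# Assumed encoding (not from the paper): BOTTOM and ERROR are sentinel
# strings; a successful outcome is a triple (ids, locals, memory).
BOTTOM = "bottom"   # stands for divergence
ERROR = "error"     # stands for a failed execution


def bind(f, outcome):
    """Run f on the state carried by `outcome`, accumulating block ids."""
    if outcome == BOTTOM or outcome == ERROR:
        return outcome                  # nothing to run f on
    ids, l, m = outcome
    result = f(l, m)
    if result == BOTTOM or result == ERROR:
        return result                   # propagate failure or divergence
    ids2, l2, m2 = result
    return (ids | ids2, l2, m2)         # union of the allocated block ids


def if_(b, x, y):
    """Select x or y on a boolean guard; any other guard is an error."""
    if b is True:
        return x
    if b is False:
        return y
    return ERROR
```

Note that `if_` receives already-computed outcomes for both branches, mirroring the mathematical definition rather than a lazy conditional.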

The evaluation rules for skip, sequencing, conditionals, while, and assignment
are standard. The rule for heap lookup, $x \leftarrow [e]$, evaluates e to a pointer and
then looks it up in the heap, yielding an error if e does not evaluate to a pointer
or if it evaluates to a pointer that is invalid, either because its block id is not
allocated or because its offset is out of bounds. Similarly, the heap mutation
command, $[e_1] \leftarrow e_2$, requires that $e_1$ evaluate to a pointer that is valid in the
current memory m (i.e., such that looking it up in m yields something other than
$\bot$). The allocation command $x \leftarrow \mathtt{alloc}(e)$ first evaluates e to an integer n, then
calculates the next free block id for the current machine state ($\mathit{fresh}(\mathit{ids}(l, m))$);
it yields a new machine state where x points to the first cell in the new block
and where a new block of n cells is added to the heap, all initialized to 0. Finally,
free(e) evaluates e to a pointer and yields a new heap where every cell sharing
the same block id as this pointer is undefined.
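
The heap rules described above can be sketched in Python as follows. This is only an illustration under stated assumptions: memory is a dict keyed by (block id, offset) pairs, an undefined cell is an absent key, the commands take already-evaluated values instead of expressions, and `fresh` picks one more than the largest block id mentioned in the state. The paper only requires that `fresh` return an unused id, so that last choice is hypothetical.

```python
# Sketch of the heap commands of Fig. 6 on already-evaluated arguments.
# Assumptions: pointers are (block_id, offset) tuples, memory is a dict
# keyed by such tuples, and undefined cells are simply absent keys.
ERROR = "error"


def ids(l, m):
    """Block ids mentioned in the memory domain or in any stored pointer."""
    found = {i for (i, _) in m}
    for v in list(l.values()) + list(m.values()):
        if isinstance(v, tuple):
            found.add(v[0])
    return found


def fresh(used):
    """Hypothetical choice of a fresh id: one past the largest used id."""
    return max(used, default=-1) + 1


def alloc(l, m, x, n):
    """x <- alloc(n): add a zero-initialised block of n cells."""
    if not isinstance(n, int) or isinstance(n, bool):
        return ERROR
    i = fresh(ids(l, m))
    m2 = dict(m)
    m2.update({(i, k): 0 for k in range(n)})
    return ({i}, {**l, x: (i, 0)}, m2)


def load(l, m, x, p):
    """x <- [p]: error unless p is a pointer to a defined cell."""
    if not isinstance(p, tuple) or p not in m:
        return ERROR
    return (set(), {**l, x: m[p]}, m)


def store(l, m, p, v):
    """[p] <- v: error unless p is a pointer to a defined cell."""
    if not isinstance(p, tuple) or p not in m:
        return ERROR
    return (set(), l, {**m, p: v})


def free(l, m, p):
    """free(p): p must point to offset 0 of a block with a defined cell."""
    if not isinstance(p, tuple) or p[1] != 0:
        return ERROR
    if not any(i == p[0] for (i, _) in m):
        return ERROR
    return (set(), l, {k: v for k, v in m.items() if k[0] != p[0]})
```

Because `ids` also inspects stored pointer values, a block id that is still referenced by a dangling pointer is not handed out again by `alloc` in this sketch.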




References 


1. Caja. Attack vectors for privilege escalation (2012).
   http://code.google.com/p/google-caja/wiki/AttackVectors
2. Agten, P., Jacobs, B., Piessens, F.: Sound modular verification of C code executing
   in an unverified context. In: POPL (2015).
   https://lirias.kuleuven.be/bitstream/123456789/471365/3/sound-verification.pdf
3. Ahmed, A., Fluet, M., Morrisett, G.: L³: a linear language with locations.
   Fundam. Inform. 77(4), 397-449 (2007).
   http://content.iospress.com/articles/fundamenta-informaticae/fi77-4-06
4. Askarov, A., Hunt, S., Sabelfeld, A., Sands, D.: Termination-insensitive
   noninterference leaks more than just a bit. In: ESORICS (2008).
   http://www.cse.chalmers.se/~andrei/esorics08.pdf
5. Azevedo de Amorim, A., Dénès, M., Giannarakis, N., Hritcu, C., Pierce, B.C.,
   Spector-Zabusky, A., Tolmach, A.: Micro-policies: formally verified, tag-based
   security monitors. In: S&P, Oakland (2015).
   http://prosecco.gforge.inria.fr/personal/hritcu/publications/micro-policies.pdf
6. Backes, M., Köpf, B., Rybalchenko, A.: Automatic discovery and quantification of
   information leaks. In: S&P, Oakland (2009). https://doi.org/10.1109/SP.2009.18
7. Balabonski, T., Pottier, F., Protzenko, J.: Type soundness and race freedom for
   Mezzo. In: Codish, M., Sumii, E. (eds.) FLOPS 2014. LNCS, vol. 8475, pp. 253-269.
   Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07151-0_16
8. Benton, N., Tabareau, N.: Compiling functional types to relational specifications
   for low level imperative code. In: Kennedy, A., Ahmed, A. (eds.) TLDI (2009).
   http://dblp.uni-trier.de/db/conf/tldi/tldi2009.html#BentonT09
9. Bhargavan, K., Delignat-Lavaud, A., Maffeis, S.: Defensive JavaScript - building
   and verifying secure web components. In: FOSAD (2013).
   http://dx.doi.org/10.1007/978-3-319-10082-1_4
10. Chisnall, D., Rothwell, C., Watson, R.N.M., Woodruff, J., Vadera, M., Moore,
    S.W., Roe, M., Davis, B., Neumann, P.G.: Beyond the PDP-11: architectural
    support for a memory-safe C abstract machine. In: ASPLOS (2015).
    https://www.cl.cam.ac.uk/~dc552/papers/asplos15-memory-safe-c.pdf
11. Clause, J.A., Doudalis, I., Orso, A., Prvulovic, M.: Effective memory protection
    using dynamic tainting. In: ASE (2007).
    http://www.cc.gatech.edu/~orso/papers/clause.doudalis.orso.prvulovic.pdf
12. de Amorim, A.A., Collins, N., DeHon, A., Demange, D., Hritcu, C., Pichardie, D.,
    Pierce, B.C., Pollack, R., Tolmach, A.: A verified information-flow architecture.
    J. Comput. Secur. 24(6), 689-734 (2016). https://doi.org/10.3233/JCS-15784
13. Devietti, J., Blundell, C., Martin, M.M.K., Zdancewic, S.: HardBound:
    architectural support for spatial safety of the C programming language.
    In: ASPLOS (2008). http://acg.cis.upenn.edu/papers/asplos08_hardbound.pdf
14. Devriese, D., Piessens, F., Birkedal, L.: Reasoning about object capabilities with
    logical relations and effect parametricity. In: EuroS&P (2016).
    http://cs.au.dk/~birke/papers/object-capabilities-tr.pdf
15. Dhawan, U., Hritcu, C., Rubin, R., Vasilakis, N., Chiricescu, S., Smith,
    J.M., Knight Jr., T.F., Pierce, B.C., DeHon, A.: Architectural support for
    software-defined metadata processing. In: ASPLOS (2015).
    http://ic.ese.upenn.edu/abstracts/sdmp-_asplos2015.html
16. Durumeric, Z., Kasten, J., Adrian, D., Halderman, J.A., Bailey, M., Li, F., Weaver,
    N., Amann, J., Beekman, J., Payer, M., Paxson, V.: The matter of Heartbleed.
    In: IMC (2014). http://doi.acm.org/10.1145/2663716.2663755


17. Elliott, T., Pike, L., Winwood, S., Hickey, P.C., Bielman, J., Sharp, J., Seidel, E.L.,
    Launchbury, J.: Guilt free Ivory. In: Haskell (2015).
    https://www.cs.indiana.edu/~lepike/pubs/ivory.pdf
18. Fournet, C., Swamy, N., Chen, J., Dagand, P.-É., Strub, P.-Y., Livshits, B.: Fully
    abstract compilation to JavaScript. In: POPL (2013).
    https://research.microsoft.com/pubs/176601/js-star.pdf
19. Goguen, J.A., Meseguer, J.: Security policies and security models. In: S&P (1982).
    http://spy.sci.univr.it/papers/Isa-orig/Sicurezza/NonInterferenza/noninter.pdf
20. Hicks, M.: What is memory safety? (2014).
    http://www.pl-enthusiast.net/2014/07/21/memory-safety/
21. ISO. ISO C standard 1999. Technical report. ISO/IEC 9899:1999 draft. ISO (1999).
    http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
22. Jana, S., Shmatikov, V.: Memento: learning secrets from process footprints. In:
    S&P, Oakland (2012). https://doi.org/10.1109/SP.2012.19
23. Kang, J., Hur, C., Mansky, W., Garbuzov, D., Zdancewic, S., Vafeiadis, V.: A
    formal C memory model supporting integer-pointer casts. In: PLDI (2015).
    https://www.seas.upenn.edu/~wmansky/mcast.pdf
24. Krebbers, R.: The C standard formalized in Coq. Ph.D. thesis, Radboud University
    Nijmegen (2015). http://robbertkrebbers.nl/research/thesis.pdf
25. Kwon, A., Dhawan, U., Smith, J.M., Knight Jr., T.F., DeHon, A.: Low-fat pointers:
    compact encoding and efficient gate-level implementation of fat pointers for spatial
    safety and capability-based security. In: CCS (2013).
    http://www.crash-safe.org/node/27
26. Leroy, X., Blazy, S.: Formal verification of a C-like memory model and its uses for
    verifying program transformations. JAR 41(1), 1-31 (2008).
    http://pauillac.inria.fr/~xleroy/publi/memory-model-journal.pdf
27. Maffeis, S., Mitchell, J.C., Taly, A.: Object capabilities and isolation of untrusted
    web applications. In: S&P, Oakland (2010).
    https://www.doc.ic.ac.uk/~maffeis/papers/oakland10.pdf
28. Memarian, K., Matthiesen, J., Lingard, J., Nienhuis, K., Chisnall, D., Watson,
    R.N.M., Sewell, P.: Into the depths of C: elaborating the de facto standards. In:
    PLDI (2016). http://doi.acm.org/10.1145/2908080.2908081
29. Meyerovich, L.A., Livshits, V.B.: Conscript: specifying and enforcing fine-grained
    security policies for JavaScript in the browser. In: S&P, Oakland (2010).
    http://dx.doi.org/10.1109/SP.2010.36
30. Morrisett, G., Felleisen, M., Harper, R.: Abstract models of memory management.
    In: FPCA (1995). http://doi.acm.org/10.1145/224164.224182
31. Nagarakatte, S., Zhao, J., Martin, M.M.K., Zdancewic, S.: SoftBound: highly
    compatible and complete spatial memory safety for C. In: PLDI (2009).
    http://repository.upenn.edu/cgi/viewcontent.cgi?article=1941&context=cis_reports
32. Nagarakatte, S., Zhao, J., Martin, M.M.K., Zdancewic, S.: CETS: compiler
    enforced temporal safety for C. In: ISMM (2010).
    http://acg.cis.upenn.edu/papers/ismm10_cets.pdf
33. Necula, G.C., Condit, J., Harren, M., McPeak, S., Weimer, W.: CCured: type-safe
    retrofitting of legacy software. TOPLAS 27(3), 477-526 (2005).
    https://doi.org/10.1145/1065887.1065892
34. Pitts, A.M.: Nominal Sets: Names and Symmetry in Computer Science. Cambridge
    University Press, New York (2013)
35. Pottier, F., Protzenko, J.: Programming with permissions in Mezzo. In: ICFP
    (2013)


36. Reynolds, J.C.: Separation logic: a logic for shared mutable data structures. In:
    LICS (2002). http://dl.acm.org/citation.cfm?id=645683.664578
37. The Rust programming language (2017). http://www.rust-lang.org
38. Schlesinger, C., Pattabiraman, K., Swamy, N., Walker, D., Zorn, B.G.: Modular
    protections against non-control data attacks. JCS 22(5), 699-742 (2014).
    https://doi.org/10.3233/JCS-140502
39. Smith, G.: On the foundations of quantitative information flow. In: FoSSaCS
    (2009). http://doi.org/10.1007/978-3-642-00596-1_21
40. Stefan, D., Buiras, P., Yang, E.Z., Levy, A., Terei, D., Russo, A., Mazières,
    D.: Eliminating cache-based timing attacks with instruction-based scheduling. In:
    Crampton, J., Jajodia, S., Mayes, K. (eds.) ESORICS 2013. LNCS, vol. 8134,
    pp. 718-735. Springer, Heidelberg (2013).
    https://doi.org/10.1007/978-3-642-40203-6_40
41. Swamy, N., Hicks, M.W., Morrisett, G., Grossman, D., Jim, T.: Safe manual
    memory management in Cyclone. SCP 62(2), 122-144 (2006).
    http://www.cs.umd.edu/~mwh/papers/cyc-mm-scp.pdf
42. Swasey, D., Garg, D., Dreyer, D.: Robust and compositional verification of object
    capability patterns. In: OOPSLA (2017, to appear).
    https://people.mpi-sws.org/~swasey/papers/ocpl
43. Szekeres, L., Payer, M., Wei, T., Song, D.: SoK: eternal war in memory. In: IEEE
    S&P (2013). http://lenx.100871.net/papers/War-oakland-CR.pdf
44. Taly, A., Erlingsson, Ú., Mitchell, J.C., Miller, M.S., Nagra, J.: Automated analysis
    of security-critical JavaScript APIs. In: S&P, Oakland (2011).
    http://dx.doi.org/10.1109/SP.2011.39
45. Turon, A.: Rust: from POPL to practice (keynote). In: POPL (2017).
    http://dl.acm.org/citation.cfm?id=3011999
46. Williams, C.: Oracle’s Larry Ellison claims his Sparc M7 chip is hacker-proof
    - errr... The Register (2015).
    http://www.theregister.co.uk/2015/10/28/oracle_sparc.m7/
47. Yang, E.Z., Mazières, D.: Dynamic space limits for Haskell. In: PLDI (2014).
    http://doi.acm.org/10.1145/2594291.2594341
48. Yang, H., O’Hearn, P.W.: A semantic basis for local reasoning. In: FoSSaCS (2002).
    http://dl.acm.org/citation.cfm?id=646794.704850
49. Zhang, D., Askarov, A., Myers, A.C.: Language-based control and mitigation of
    timing channels. In: PLDI (2012). http://doi.acm.org/10.1145/2254064.2254078


Open Access This chapter is licensed under the terms of the Creative Commons 
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), 
which permits use, sharing, adaptation, distribution and reproduction in any medium 
or format, as long as you give appropriate credit to the original author(s) and the 
source, provide a link to the Creative Commons license and indicate if changes were 
made. 


The images or other third party material in this chapter are included in the chapter’s
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will
need to obtain permission directly from the copyright holder.


Leakage, Information Flow, 
and Protocols 




Formal Verification 
of Integrity-Preserving Countermeasures 
Against Cache Storage Side-Channels 


Hamed Nemati¹, Christoph Baumann², Roberto Guanciale², and Mads Dam²


¹ CISPA, Saarland University, Saarbrücken, Germany
hnnemati@cispa.saarland
² KTH Royal Institute of Technology, Stockholm, Sweden
{cbaumann,robertog,mfd}@kth.se


Abstract. Formal verification of systems-level software such as hypervisors
and operating systems can enhance system trustworthiness. However, without
taking low-level features like caches into account, the verification may
become unsound. While this is a well-known fact w.r.t. timing leaks, few works
have addressed latent cache storage side-channels, whose effects are not
limited to information leakage. We present a verification methodology to
analyse the soundness of countermeasures used to neutralise these channels.
We apply the proposed methodology to existing countermeasures, showing that
they restore the integrity of the system. We decompose the proof effort into
verification conditions that allow for an easy adaptation of our strategy to
various software and hardware platforms. As a case study, we extend the
verification of an existing hypervisor whose integrity can be tampered with
using cache storage channels. We used the HOL4 theorem prover to validate our
security analysis, applying the verification methodology to a generic
hardware model.


1 Introduction 


Formal verification of low-level software such as microkernels, hypervisors, and
drivers has made big strides in recent years [3,4,17,21,22,33,37,38]. We appear
to be approaching the point where the promise of provably secure, practical
system software is becoming a reality. However, system verification is usually
based on models that are far simpler than contemporary state-of-the-art hardware.
Many features pose significant challenges: memory models, pipelines,
speculation, out-of-order execution, peripherals, and various coprocessors, for
instance for system management. In a security context, caches are notorious. They
have been known for years to give rise to timing side channels that are difficult
to fully counteract [13,16,26,28,32,36]. Also, cache management is closely tied to
memory management, which, since it governs memory mapping, access control, and
cache configuration through page tables residing in memory, is one of the most
complex and security-critical components in the computer architecture flora.
Computer architects strive to hide this complexity from application programmers,
but system software and device drivers need explicit control over features
like cacheability attributes. In virtualization scenarios, for instance, it is
critical for performance to be able to delegate cache management authority for
pages belonging to a guest OS to the guest itself. With such delegated authority, a
guest is free to configure its share of the memory system as it wishes, including
configurations that may break conventions normally expected of a well-behaved
OS. For instance, a guest OS will usually be able to create memory aliases and
to set cacheability attributes as it wishes. Put together, these capabilities can,
however, give rise to memory incoherence, since the same physical location can
now be pointed to by two virtual addresses, one to cache and one to memory.
This opens the door to cache storage attacks on both confidentiality and
integrity, as was shown in [20]. Analogous problems arise due to the presence of
instruction-caches, which can contain binary code that differs from the code
stored in memory. Unlike timing channels, which are external to the models used
for formal analysis and do not invalidate verification of integrity properties,
storage channels make the cacheless models unsound: using them for security
analysis can lead to conclusions that are false.

This shows the need to develop verification frameworks for low-level system
software that are able to adequately reflect the presence of caches. It is
particularly desirable if this can be done in a manner that allows existing
verification tools to be reused on simpler models that do not consider caches.
This is the goal we set ourselves in this paper.


Our Contributions. We undertake the first rigorous analysis of
integrity-preserving countermeasures against cache storage channel attacks. We
propose a practical verification framework that is independent of the specific
hardware and of the software executing on the platform, and that can be used to
analyse the security of low-level software on models with enabled caches. Our
framework accommodates both data and instruction caches, and we have proved its
soundness in the HOL4 theorem prover. Our strategy consists of introducing
hardware and software proof obligations and demonstrating that they prevent
attacks on integrity. The framework is used to verify the soundness of two
countermeasures for data-caches and two countermeasures for instruction-caches.
This results in code verification conditions that can be analysed on cacheless
models, so that existing tools [6,11,31] (mostly not available on cache-enabled
models) can automate this task to a large extent. To demonstrate that our
methodology can be applied to commodity hardware, we formally model a generic
cache and demonstrate that extensions of existing cacheless architectural models
with the generic cache model satisfy all requirements imposed by our methodology.
The practicability of our approach is shown by applying it to repair the
verification of an existing and vulnerable hypervisor [21], demonstrating that
the modified design prevents cache attacks.


2 Related Work


Cache Storage Channels. The existence of cache storage channels due to
mismatched cacheability attributes was first pointed out in [20]. That paper also
sketches how prior integrity and confidentiality proofs for a sequential memory
model could be repaired, identifying coherency of the data-cache as a key
requirement. However, the verification methodology is only sketched and provides
merely an intuition about the proof strategy. The present paper develops
these ideas in detail, providing several new contributions, including (i) a
formal cache-aware hardware model, (ii) a revised and detailed proof strategy that
decomposes verification into hardware-, software-, and countermeasure-dependent
proof obligations, (iii) introduction and verification of instruction
cache coherency, (iv) formal definitions of all proof obligations and invariants,
(v) a detailed explanation of the proof and how the proof obligations can be
discharged for given applications and countermeasures, and (vi) a complete
mechanization in HOL4.


Formal Verification. Recent works on kernel and hypervisor verification
[8,10,17-19,21,24,25,33,34] all assume a sequential memory model and leave cache
issues to be managed by model-external means, while the CVM framework [4]
treats caches only in the context of device management [23]. In [21], a cacheless
model was used to prove security of the hypervisor used here as a case study. Due
to the absence of caches in the underlying hardware model, the verification result
is unsound in the presence of uncacheable aliases, as demonstrated in [20].


Timing Channels. Timing attacks and countermeasures have been formally verified
to varying degrees of detail in the literature. Since their analysis generally
ignores caches, verified kernels are susceptible to timing attacks. For
instance, Cock et al. [13] examined the bandwidth of timing channels in seL4
and possible countermeasures including cache coloring. Other related work
adopts formal analysis either to check the rigour of countermeasures
[5,7,9,15,20,35] or to examine the bandwidth of side-channels [14,27].

There is no comparable formal treatment for cache storage channels. These
channels carry information through memory and, in addition to permitting
illicit information flows, can be used to compromise integrity. To the best of
our knowledge, we are the first to present a detailed security proof for
countermeasures against cache storage channel attacks.


3 Threats, Countermeasures, and Verification Goal 


Data-Caches and Aliases. Modern CPU architectures such as ARM, Power, and
x64 allow configuring whether a given virtual page is cacheable or not. This
capability can result in a class of attacks called “alias-driven attacks”.
Consider a victim reference monitor that (1) validates an input stored in a
memory location against a security policy and (2) uses this input to implement
a critical functionality. Assume an incoherent state for this memory location:
the data-cache contains a value for this location that differs from the content
of the memory, but the cache line is not dirty. If the cache line is evicted
between (1) and (2), its content is not written into the memory, since it is not
dirty. In this case, the victim can potentially evaluate the policy using the
value fetched from the cache and later use the content stored in memory to
implement the critical functionality, allowing untrusted inputs to bypass the
policy. This behavior has been demonstrated for ARMv7 and ARMv8 CPUs [20] as
well as for MIPS, where uncacheable aliases have been used to establish
incoherency. This behavior clearly departs from the behavior of a system that
has no cache. However, x64 processors that implement “self-snooping” appear to
be immune to this phenomenon.
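
To make the sequence of events concrete, the following toy Python model replays the attack on a single physical address. It is a sketch under simplifying assumptions: the class and method names are hypothetical, the cache is a plain dictionary without sets or ways, and eviction is triggered explicitly. The only behaviour taken from the text is that evicting a clean line does not write it back.

```python
# Toy replay of an alias-driven attack on one physical address.
# All names are hypothetical; the cache is a dict of lines.
class Line:
    def __init__(self, value, dirty):
        self.value = value
        self.dirty = dirty


class System:
    def __init__(self):
        self.mem = {}
        self.cache = {}

    def write_uncached(self, pa, value):
        """Write through an uncacheable alias: memory only."""
        self.mem[pa] = value

    def read_cached(self, pa):
        """Read through a cacheable alias, filling the cache on a miss."""
        if pa not in self.cache:
            self.cache[pa] = Line(self.mem[pa], dirty=False)
        return self.cache[pa].value

    def evict(self, pa):
        """Evict a line; only dirty lines are written back."""
        line = self.cache.pop(pa, None)
        if line is not None and line.dirty:
            self.mem[pa] = line.value


def attack():
    s, pa = System(), 0x1000
    s.write_uncached(pa, "benign")
    s.read_cached(pa)                  # cache holds "benign", line is clean
    s.write_uncached(pa, "malicious")  # memory changes behind the cache
    checked = s.read_cached(pa)        # (1) policy check reads the cache
    s.evict(pa)                        # clean line: no write-back
    used = s.read_cached(pa)           # (2) use reads the memory content
    return checked, used
```

In this model `attack()` returns a pair whose first component is the value that was validated and whose second component is the different value that is actually used.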

A system that (1) permits an attacker to configure the cacheability of its
virtual memory, (2) acquires ownership of that location from the attacker, and
(3) uses the location to read security-critical information can be the target
of this attack. An example is the hypervisor presented in Sect. 5.5. The runtime
monitor presented in [12], which forbids the execution of unsigned code, can
also be attacked using caches. The attacker can load a signed process in the
cache and malware in memory. Similarly, remote attestation checks the integrity
of a device by means of a trusted measuring function. If this function accesses
stale data from the caches, then the measurements can be inaccurate.

In this paper we analyse two countermeasures against alias-driven attacks: 
“always cacheability” consists in defining a fixed region of memory that is made 
always cacheable and ensuring that the trusted software rejects any input point- 
ing outside this region; “selective eviction” consists in flushing from the cache 
every location that is accessed by the trusted software and that has been pre- 
viously accessed by the attacker. A description and evaluation of other possible 
countermeasures against cache storage channels was provided in [20]. 


Instruction-Caches. In a similar vein, instruction-caches may be dangerous if
the content of executable pages is changed without using cache management
instructions to maintain memory coherency. Suppose that software (1) executes
instructions from a region of memory, thus filling the instruction-cache with the
instructions of a program, (2) updates the memory with the code of a new
program without flushing the cache, and (3) executes the new program. Since
between (1) and (3) some lines of the instruction-cache are evicted and others
are not, the CPU can execute a mix of the code of the two programs, resulting in
a behavior that is hard to predict.
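
The three steps above can be replayed with a small Python sketch. The instruction strings, addresses, and the choice of which line is evicted are made up for illustration, and the model assumes that writes reach memory directly, which sidesteps the data-cache.

```python
# Toy replay of the stale instruction-cache scenario.
# Instruction names, addresses, and the evicted line are hypothetical.
def fetch_program(mem, i_cache):
    """Fetch every instruction: a cache hit wins over memory."""
    return [i_cache.get(pa, mem[pa]) for pa in sorted(mem)]


def run_scenario(flush_after_update):
    old = {0: "old0", 1: "old1", 2: "old2"}
    mem = dict(old)
    i_cache = dict(old)        # (1) running the old program fills the cache
    mem.update({0: "new0", 1: "new1", 2: "new2"})  # (2) new code in memory
    if flush_after_update:
        i_cache.clear()        # cache maintenance restores coherency
    else:
        i_cache.pop(1)         # only some lines happen to be evicted
    return fetch_program(mem, i_cache)  # (3) execute the "new" program
```

Without the flush, the fetched sequence interleaves old and new instructions; with it, only the new program is fetched.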

The presence of instruction-caches affects systems whose security depends on
dynamically loaded code. This includes the aforementioned runtime monitor,
boot-loaders that load or relocate programs, systems that implement dynamic
code randomization, and Software Fault Isolation (SFI) [29] sandboxes that
inspect binary code to isolate loadable third-party modules.

We analyse two countermeasures against attacks that use instruction-caches:
“constant program memory” ensures that the trusted executable code is never
modified; “selective eviction” consists in selectively evicting lines of the
instruction-cache and flushing lines of the data-cache for locations that are
modified.


3.1 Verification Goals


In this work we consider trusted system software (the “kernel”) that shares the
system with untrusted user-level software (the “application”): the application
requests services from the kernel. The hardware execution mode used by the
application is less privileged than the mode used by the kernel. The application
is potentially malicious and takes the role of the attacker. The kernel dynamically
manages memory ownership and can provide various services, for instance for
secure ownership transfer. This enables the application to pass data to the kernel
services while avoiding expensive copy operations: the application prepares the
input inside its own memory, the ownership of this memory is transferred to the
kernel, and the corresponding kernel routine operates on the input in-place.

Intuitively, by guaranteeing integrity we mean that it is not possible for
the application to influence the kernel using cache features (except possibly for
timing channels, which are not considered in this work). That is, if there is a
possibility for the application to affect the kernel behavior (e.g. by providing
parameters to a system call) in a system with caches, there must be the same
possibility in an idealized system that has no caches. This goal is usually
formalized by requiring that the cacheless system can simulate all possible
executions of the system with caches (i.e. all executions of the real system are
admitted by the specification, which in this case is represented by the cacheless
system).

Unfortunately, ensuring this property for complete executions is not possible:
since the application is untrusted, we need to assume that its code is unknown
and that it can exploit behaviors of caches that are not available in the cacheless
system, making it impossible to guarantee that the behavior of the application is
the same in both systems. For this reason, we analyse executions of the
application and of the kernel separately.

We first identify a set of memory resources called “critical”. These are the 
resources for which integrity must be preserved and that affect the kernel behav- 
ior. For example, in an operating system the memory allocator uses a data struc- 
ture to keep track of the ownership of allocated memory pages. Thus all pages 
not belonging to the untrusted process (the application) are considered critical. 
Since this classification depends on the content of the allocator data structure, 
this is also a critical resource. Similarly in [21] the page type data structure 
identifies critical resources. 

Then we phrase integrity as two complementary properties: (1) direct or 
indirect modification of the critical resources is impossible while the application 
is executing on the system with caches; and (2) the kernel has the same behavior 
in the cacheless and the cache-aware system. 

An alternative approach to phrasing integrity might be to show the absence of
information flow from application to kernel. There are a number of issues with
such an approach in this context, however. First, attacks that do not involve
information flow would not be covered. Second, it is not clear how an
information-flow-oriented account would handle kernel invocations; these generally
correspond to endorsement actions in a multi-level security lattice setting and
are challenging to map to the present setting. On the other hand, our account of
integrity permits any safety property that only depends on the critical resources
and holds for the cacheless system to be transferred to the system with caches.


4 Formalisation 


As a basis for our study we define two models, a cacheless and a cache-aware
model. The cacheless model represents a memory-coherent single-core system where
all caches are disabled. The cache-aware model is the same system augmented by a
single-level separated data- and instruction-cache.


4.1 Cacheless Model 


The cacheless model is ARM-flavoured but general enough to apply to other
architectures. A (cacheless) state $s = (\mathit{reg}, \mathit{psreg}, \mathit{coreg}, \mathit{mem}) \in S$ is a tuple of
general-purpose registers reg (including the program counter pc), program-status
registers psreg, coprocessor registers coreg, and memory mem. The core
executes either in non-privileged mode U or privileged mode P, $\mathit{Mode}(s) \in \{U, P\}$.
Executions in privileged mode are necessarily trusted, since they are able to
modify the system configuration, e.g., coprocessor registers, in arbitrary ways.
The program-status registers psreg encode the execution mode and other
execution parameters such as the arithmetic flags. The coprocessor registers coreg
determine a range of system configuration parameters, including virtual memory
mapping and memory protection. The word-addressable memory is represented
by $\mathit{mem} : PA \to \mathbb{B}^w$, where $\mathbb{B} = \{0, 1\}$, PA is the set of physical addresses, and
w is the word size.

Executions in non-privileged mode are unable to directly modify coprocessor
registers as well as certain critical program-status registers. For instance,
the execution mode can be switched to P only by raising an exception. Memory
accesses are controlled by a Memory Management Unit (MMU), which also
determines memory region attributes such as cacheability. Let $A = \{wt, rd, ex\}$
be the set of access permissions (for write, read, and execute respectively) and
$M = \{U, P\}$ be the set of execution modes. The MMU model is the function
$\mathit{MMU}(s, va) \in (2^{M \times A} \times PA \times \mathbb{B})$ which yields for a virtual address $va \in VA$ the
set of granted access rights, the translation, and the cacheability attribute. Note
that the same physical addresses can be accessed with different access rights and
different cacheability settings using different virtual aliases. Hereafter, when it
is clear from the context, we use $\mathit{MMU}(s, va)$ to represent the translation of va.
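
For intuition, the MMU function can be pictured as a lookup that returns the triple of rights, translation, and cacheability. The Python sketch below is an assumption-laden illustration: it replaces the state-dependent page-table walk by a flat dictionary, and the type name `Translation`, the helper `aliases`, and all addresses are hypothetical.

```python
# Sketch of the MMU model as a lookup in a flat page table.
# The table layout and all addresses are made up for illustration.
from typing import Dict, FrozenSet, List, NamedTuple, Tuple


class Translation(NamedTuple):
    rights: FrozenSet[Tuple[str, str]]  # subset of M x A, e.g. ("U", "rd")
    pa: int                             # translated physical address
    cacheable: bool                     # cacheability attribute


def mmu(page_table: Dict[int, Translation], va: int) -> Translation:
    """Return access rights, translation, and cacheability for va."""
    return page_table[va]


def aliases(page_table: Dict[int, Translation], pa: int) -> List[int]:
    """All virtual addresses that translate to the physical address pa."""
    return sorted(va for va, t in page_table.items() if t.pa == pa)


USER_RW = frozenset({("U", "rd"), ("U", "wt")})

# Two virtual aliases of the same physical address, one cacheable, one not.
PAGE_TABLE = {
    0x4000: Translation(USER_RW, 0x9000, True),
    0x8000: Translation(USER_RW, 0x9000, False),
}
```

The two entries of `PAGE_TABLE` are an instance of the situation noted above: the same physical address reachable through aliases with different cacheability settings.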

The behaviour of the system is defined as a labeled transition system using the
relation $\to_m \subseteq S \times S$, where $m \in M$ and if $s \to_m s'$ then $\mathit{Mode}(s) = m$. Each
transition represents the execution of a single instruction. When needed, we
let $s \to_m s'\ [ops]$ denote that the operations ops are performed on the memory
subsystem, where ops is a list whose elements are either $wt(pa, c)$ (pa was
written with cacheability attribute c), $rd(pa, c)$ (pa was read with cacheability
attribute c), $fl_D(pa)$, or $fl_I(pa)$ (the data- or instruction-cache flush operation for
pa, which have no effects in the cacheless model). We use $s_0 \leadsto s_n$ to represent the
weak transition relation that holds if there is an execution $s_0 \to \cdots \to s_n$ such
that $\mathit{Mode}(s_n) = U$ and $\mathit{Mode}(s_j) \neq U$ for $0 < j < n$, i.e. the weak transition
hides states while the kernel is running.
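
One way to picture the weak transition relation is as a filter over a concrete execution. The following Python sketch assumes an execution is given as a list of states and that a caller-supplied function `mode` returns "U" or "P" for a state; both the representation and the function name `weak_trace` are assumptions of this example.

```python
# Sketch: collapse an execution into its weak transitions.
# The list-of-states representation and `mode` are assumptions.
def weak_trace(trace, mode):
    """Keep s_0 and every later state whose mode is "U".

    Each pair of consecutive states in the result is related by the
    weak transition: all states strictly between them run in a mode
    other than "U", so kernel-internal states are hidden.
    """
    if not trace:
        return []
    return [trace[0]] + [s for s in trace[1:] if mode(s) == "U"]
```

For example, with states encoded as (mode, step) pairs, the execution U0, P1, P2, U3, U4 collapses to U0, U3, U4.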


4.2 Cache-Aware Model


We model a single-core processor with single-level separated instruction and 
data-caches, i.e., a modified Harvard architecture. In Sect. 7 we discuss variations 
and generalizations of this model. 

A state $\bar{s} \in \bar{S}$ in the cache-aware model has the components of the cacheless
model together with a data-cache d-cache and an instruction-cache i-cache,
$\bar{s} = (\mathit{reg}, \mathit{psreg}, \mathit{coreg}, \mathit{mem}, \textit{d-cache}, \textit{i-cache})$. The function MMU and the transition
relation $\to_m \subseteq \bar{S} \times \bar{S}$ are extended to take into account caches.

Other definitions of the previous subsection are extended trivially. We use
$\textit{d-hit}(\bar{s}, pa)$ to denote a data-cache hit for address pa, $\textit{d-dirty}(\bar{s}, pa)$ to identify
dirtiness of the address pa (i.e. if the value of pa has been modified in cache and
differs from the memory content), and $\textit{d-cnt}(\bar{s}, pa)$ to obtain the value for pa
stored in the data-cache (respectively $\textit{i-hit}(\bar{s}, pa)$, $\textit{i-dirty}(\bar{s}, pa)$, and $\textit{i-cnt}(\bar{s}, pa)$
for the instruction-cache).

Due to the use of the modified Harvard architecture and the presence of 
caches, there are three views of the memory subsystem: the data-view Dv, the 
instruction-view Iv, and the memory-view Mv: 


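\[
\begin{aligned}
Dv(\bar{s}, pa) &= \text{if } \text{d-hit}(\bar{s}, pa) \text{ then } \text{d-cnt}(\bar{s}, pa) \text{ else } \bar{s}.\text{mem}(pa) \\
Iv(\bar{s}, pa) &= \text{if } \text{i-hit}(\bar{s}, pa) \text{ then } \text{i-cnt}(\bar{s}, pa) \text{ else } \bar{s}.\text{mem}(pa) \\
Mv(\bar{s}, pa) &= \text{if } \text{d-dirty}(\bar{s}, pa) \text{ then } \text{d-cnt}(\bar{s}, pa) \text{ else } \bar{s}.\text{mem}(pa)
\end{aligned}
\]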


We require that the kernel always uses cacheable virtual aliases. There- 
fore, kernel reads access the data-view and kernel instruction fetches access the 
instruction-view. Moreover, the MMU always consults first the data-cache when 
it fetches a page-table descriptor, as is the case for instance in ARM Cortex-A53 
and ARM Cortex-A8. Therefore, the MMU model uses the data-view. Finally, 
the memory-view represents what can be observed from the data-view after non- 
dirty cache lines have been evicted. 
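
The three views can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: one value per address, caches as dictionaries, and a per-line "modified" flag from which dirtiness is derived. The names `State`, `data_view`, `instr_view` and `mem_view` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class State:
    mem: Dict[int, int]
    # pa -> (cached value, modified flag); a missing key models a cache miss
    d_cache: Dict[int, Tuple[int, bool]] = field(default_factory=dict)
    i_cache: Dict[int, int] = field(default_factory=dict)


def d_hit(s: State, pa: int) -> bool:
    return pa in s.d_cache


def d_dirty(s: State, pa: int) -> bool:
    # dirty = modified in the cache AND different from memory, as in the text
    return d_hit(s, pa) and s.d_cache[pa][1] and s.d_cache[pa][0] != s.mem[pa]


def data_view(s: State, pa: int) -> int:
    return s.d_cache[pa][0] if d_hit(s, pa) else s.mem[pa]


def instr_view(s: State, pa: int) -> int:
    return s.i_cache[pa] if pa in s.i_cache else s.mem[pa]


def mem_view(s: State, pa: int) -> int:
    return s.d_cache[pa][0] if d_dirty(s, pa) else s.mem[pa]


# A clean data-cache line whose value differs from memory.
s = State(mem={0: 1}, d_cache={0: (2, False)})
print(data_view(s, 0), instr_view(s, 0), mem_view(s, 0))
```

In this state the data-view reads the cached value while the instruction-view and the memory-view read the memory content, which is the kind of disagreement the rest of the section rules out for critical resources.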


4.3 Security Properties 


As is common in designs of low-level software, we assume that the kernel uses
a static region of virtual memory $K_{vm} \subseteq VA$ for its memory accesses and that
the static region $K_{ex} \subseteq K_{vm}$ maps the kernel code.

We first identify the critical resources, i.e., those resources for which integrity
must be preserved and on which kernel behavior depends. This set always
includes the coprocessor registers, which the architecture protects from
non-privileged modifications. The security type of memory locations, however, can
dynamically change due to transfer of memory ownership, i.e., the criticality
of resources depends on the state of the system. The function $CR : \bar{S} \to 2^{PA}$
retrieves the subset of memory resources that are critical. Function $CR$ itself
depends on a subset of resources, namely the internal kernel data-structures
that determine the security type of memory resources (for the kernel being an
operating system and the application being one of its user processes, imagine
the internal state of the page allocator and the process descriptors). Similarly,
the function $EX : \bar{S} \to 2^{PA}$ retrieves the subset of critical memory resources that
contain trusted executable code. These definitions are naturally lifted to the
cacheless model, by extending a cacheless state with empty caches. Two states
$\bar{s}$ and $\bar{s}'$ have the same data-view (respectively instruction-view) of the critical
memory, written $\bar{s} \equiv_D \bar{s}'$ (respectively $\bar{s} \equiv_I \bar{s}'$), if


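\[
\{(pa, Dv(\bar{s}, pa)) \mid pa \in CR(\bar{s})\} = \{(pa, Dv(\bar{s}', pa)) \mid pa \in CR(\bar{s}')\}
\]


(respectively $Iv$ and $EX$). Finally, two states $\bar{s}$ and $\bar{s}'$ have the same critical
resources, and we write $\bar{s} \equiv_{CR} \bar{s}'$, iff $\bar{s} \equiv_D \bar{s}'$, $\bar{s} \equiv_I \bar{s}'$, and $\bar{s}.\text{coreg} = \bar{s}'.\text{coreg}$.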

Our verification approach requires introducing a system invariant $\bar{I}$ that
is software dependent and defined per kernel. This invariant ensures that the
kernel can work properly (e.g., the stack pointer and its data structures are correctly
configured); its properties are detailed in Sect. 5. A corresponding invariant $I$
for the cacheless model is derived from $\bar{I}$ by excluding properties that constrain
caches. Our goal is to establish two theorems: an application integrity theorem
showing that $\bar{I}$ correctly constrains application behaviour in the cache-aware
model, and a kernel integrity theorem showing that kernel routines in the
cache-aware model correctly refine the cacheless model.

As the application is able to break its memory coherency at will, the appli- 
cation integrity theorem is a statement about the processor hardware and its 
correct configuration. In particular, Theorem 1 shows that non-privileged execu- 
tion in the cache-aware model preserves the required invariant, that the invariant 
is adequate to preserve the critical resources, and that entries into privileged level 
correctly follow the hardware mode switching convention. For the latter, we use 
predicate $\text{ex-entry}(\bar{s})$ to identify states of the system immediately after switching
to the kernel, i.e., when an exception is triggered, the mode becomes privileged 
and the program counter points to an entry in the exception vector table. 


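Theorem 1 (Application Integrity). For all $\bar{s}$, if $\bar{I}(\bar{s})$ and $\bar{s} \to_U \bar{s}'$ then
$\bar{I}(\bar{s}')$, $\bar{s} \equiv_{CR} \bar{s}'$, and if $\text{Mode}(\bar{s}') \neq U$ then $\text{ex-entry}(\bar{s}')$.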


For the kernel we prove that the two models behave equivalently. We prove this
using forward simulation, by defining a simulation relation $R_{sim}$ guaranteeing
equality of all registers and critical memory resources, and then showing that
both the invariant and the relation are preserved by privileged transitions:


Theorem 2 (Kernel Integrity). For all $\bar{s}_1$ and $s_1$ such that $\bar{I}(\bar{s}_1)$, $\bar{s}_1\ R_{sim}\ s_1$,
and $\text{ex-entry}(\bar{s}_1)$, if $\bar{s}_1 \rightsquigarrow \bar{s}_2$ then $\exists s_2.\ s_1 \rightsquigarrow s_2$, $\bar{s}_2\ R_{sim}\ s_2$ and $\bar{I}(\bar{s}_2)$.


5 Proof Strategy


Theorems 1 and 2 are proved in five steps: 


1. First we introduce crucial properties of the hardware, abstracting from the 
details of a specific hardware architecture. We obtain a set of proof obligations 
(i.e. HW Obligation) that must be discharged for any given hardware. 

2. Next, we reduce application integrity, Theorem 1, to proof obligations (i.e. 
SW-I Obligation) on software-specific invariants of the cache-aware model. 

3. The same approach applies for kernel integrity, Theorem 2, where we also 
derive proof obligations (i.e. SW-C Obligation) on the kernel code. 

4. We then demonstrate correctness of the selected countermeasures of Sect. 3 
by discharging the corresponding proof obligations. 

5. The last step is kernel-specific: we sketch how our results allow standard 
cache-oblivious binary analysis tools to show that a kernel implements the 
countermeasures, establishing Theorems 1 and 2. 


A fundamental notion for our proof is coherency, which captures memory
resources whose content cannot be indirectly affected through cache eviction.


Definition 1 (Data-Coherency). We say that a memory resource $pa \in PA$ is
data-coherent in $\bar{s}$, $\text{D-Coh}(\bar{s}, pa)$, iff $\text{d-hit}(\bar{s}, pa)$ and $\text{d-cnt}(\bar{s}, pa) \neq \bar{s}.\text{mem}(pa)$
implies $\text{d-dirty}(\bar{s}, pa)$. A set $R \subseteq PA$ is data-coherent iff all $pa \in R$ are.


In other words, a physical location $pa$ is data-coherent if a non-dirty cache hit
of $pa$ in $\bar{s}$ implies that the cached value is equal to the value stored in memory.
The general intuition is that, for an incoherent resource, the view can be changed 
indirectly without an explicit memory write by evicting a clean cache-line with 
different values in the cache and memory. For instance, consider an MMU that 
looks first into the caches when it fetches a descriptor. Then if the page-tables are 
coherent, a cache eviction cannot indirectly affect the behaviour of the MMU. 
This intuition also underpins the definition of instruction-coherency. 
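
The effect of eviction on an incoherent resource can be reproduced in a toy model. The sketch below assumes one value per address, a data-cache stored as a dictionary with a "modified" flag per line, and an `evict` helper that writes dirty lines back; all names are hypothetical and the model is far simpler than a real cache.

```python
from typing import Dict, Tuple

Mem = Dict[int, int]
DCache = Dict[int, Tuple[int, bool]]  # pa -> (value, modified flag)


def dirty(mem: Mem, dc: DCache, pa: int) -> bool:
    # modified in the cache and different from memory
    return pa in dc and dc[pa][1] and dc[pa][0] != mem[pa]


def d_coh(mem: Mem, dc: DCache, pa: int) -> bool:
    # Definition 1: a hit with a value different from memory implies dirty
    incoherent_hit = pa in dc and dc[pa][0] != mem[pa]
    return (not incoherent_hit) or dirty(mem, dc, pa)


def data_view(mem: Mem, dc: DCache, pa: int) -> int:
    return dc[pa][0] if pa in dc else mem[pa]


def evict(mem: Mem, dc: DCache, pa: int) -> Tuple[Mem, DCache]:
    """Evict the line of pa; only a dirty line is written back."""
    mem2, dc2 = dict(mem), dict(dc)
    if pa in dc2:
        value, _ = dc2.pop(pa)
        if dirty(mem, dc, pa):
            mem2[pa] = value
    return mem2, dc2


mem = {0: 1}
clean_stale = {0: (2, False)}   # clean line that disagrees with memory
dirty_line = {0: (2, True)}     # dirty line, coherent by definition

for dc in (clean_stale, dirty_line):
    before = data_view(mem, dc, 0)
    after = data_view(*evict(mem, dc, 0), 0)
    print(d_coh(mem, dc, 0), before, after)
```

For the clean but stale line the data-view changes after eviction although nothing was written, whereas for the dirty line the write-back keeps the data-view stable.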


Definition 2 (Instruction-Coherency). We say that a memory resource
$pa \in PA$ is instruction-coherent in $\bar{s}$, $\text{I-Coh}(\bar{s}, pa)$, iff the following statements
hold:


1. $pa$ is data-coherent,
2. if $\text{i-hit}(\bar{s}, pa)$ then $\text{i-cnt}(\bar{s}, pa) = \bar{s}.\text{mem}(pa)$, and
3. $\neg\text{d-dirty}(\bar{s}, pa)$


Instruction-coherency requires the data-cache to be not dirty to ensure that 
eviction from the data-cache does not break part (2) of the definition. 

The role of coherency is highlighted by the following Lemma. The memory- 
view differs from the data-view only in memory resources that are cached, clean, 
and have different values stored in the cache and memory, and data-view differs 
from instruction-view only for resources that are not instruction-coherent. 


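Lemma 1. Let $pa \in PA$ and $\bar{s} \in \bar{S}$. Then:


1. $\text{D-Coh}(\bar{s}, \{pa\}) \Leftrightarrow (Dv(\bar{s}, pa) = Mv(\bar{s}, pa))$.
2. $\text{I-Coh}(\bar{s}, \{pa\}) \Rightarrow (Dv(\bar{s}, pa) = Iv(\bar{s}, pa))$.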
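
Lemma 1 can be sanity-checked by brute force on a tiny model. The sketch below is not the HOL4 proof; it assumes a single address, values 0 and 1, a data-cache line that is either absent or a (value, modified flag) pair, and an instruction-cache line that is either absent or a value. Item 1 is checked in both directions, item 2 as an implication.

```python
from itertools import product

VALUES = (0, 1)
D_LINES = [None] + [(v, m) for v in VALUES for m in (False, True)]
I_LINES = [None] + list(VALUES)


def check_lemma1() -> int:
    """Enumerate all toy states for one address and check both items."""
    checked = 0
    for mem, d, i in product(VALUES, D_LINES, I_LINES):
        d_hit = d is not None
        d_dirty = d_hit and d[1] and d[0] != mem
        d_coh = (not (d_hit and d[0] != mem)) or d_dirty
        i_coh = d_coh and (i is None or i == mem) and not d_dirty
        dv = d[0] if d_hit else mem          # data-view
        iv = i if i is not None else mem     # instruction-view
        mv = d[0] if d_dirty else mem        # memory-view
        assert d_coh == (dv == mv)           # item 1, both directions
        assert (not i_coh) or dv == iv       # item 2
        checked += 1
    return checked


print(check_lemma1(), "toy states checked")
```

The enumeration is exhaustive only for this toy state space, so it illustrates the lemma rather than replacing its proof.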


5.1 Hardware Abstraction Layer


ISA models are complex because they describe the behavior of hundreds of possi- 
ble instructions. For this reason we introduce three key notions in order to isolate 
verification tasks that are architecture-dependent and that can be verified once 
and reused for multiple countermeasures and kernels. These notions are: 


1. MMU Domain: This identifies the memory resources that affect the virtual 
memory translation. 

2. Derivability: This provides an overapproximation of the effects over the mem- 
ory and cache for instructions executed in non-privileged mode. 

3. Instruction Dependency: This identifies the memory resources that affect the 
behavior of the current instruction. 


Here we provide an intuitive definition of these notions and formalize the prop- 
erties that must be verified for the specific hardware model to ensure that these 
abstractions are sound. Section 6 comments on the verification of these properties 
for a generic hardware model in HOL4. 

MMU domain is the function $MD(\bar{s}, V) \subseteq PA$ that determines the memory
resources (i.e., the current master page-table and the linked page-tables) that
affect the translation of virtual addresses in $V \subseteq VA$.


HW Obligation 1 


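1. $MD$ is monotone, i.e., $V' \subseteq V$ implies $MD(\bar{s}, V') \subseteq MD(\bar{s}, V)$.

2. For all $\bar{s}, \bar{s}'$ and $V \subseteq VA$, if $Dv(\bar{s}, pa) = Dv(\bar{s}', pa)$ for all $pa \in MD(\bar{s}, V)$
and $\bar{s}.\text{coreg} = \bar{s}'.\text{coreg}$ then $MD(\bar{s}, V) = MD(\bar{s}', V)$ and for all $va \in V$,
$MMU(\bar{s}, va) = MMU(\bar{s}', va)$.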


Definition 3 (Derivability). We say $\bar{s}'$ is derivable from $\bar{s}$ in non-privileged
mode (denoted as $\bar{s} \rhd \bar{s}'$) if $\bar{s}.\text{coreg} = \bar{s}'.\text{coreg}$ and for every $pa \in PA$ at least one
of the $D_{acc}$ properties and at least one of the $I_{acc}$ properties hold:


$D_{\emptyset}(\bar{s}, \bar{s}', pa)$: Independently of the access rights for the address $pa$, a data-cache
line can always change due to an eviction. An eviction of a dirty cache entry
causes a write back; eviction of clean entries does not affect the memory.

$D_{rd}(\bar{s}, \bar{s}', pa)$: If non-privileged mode can read the address $pa$, the value of $pa$ in
the memory can be filled into its data-cache line, making it clean.

$D_{wt}(\bar{s}, \bar{s}', pa)$: If non-privileged mode can write the address $pa$, it can either write
directly into the data-cache, potentially making it dirty, or bypass it, by using
an uncacheable alias. Only writes can make a location in data-cache dirty.

$I_{\emptyset}(\bar{s}, \bar{s}', pa)$: Independently of the access rights for the address $pa$, the corresponding
line can always be evicted, leaving memory unchanged.

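$I_{ex}(\bar{s}, \bar{s}', pa)$: If non-privileged mode can execute the address $pa$, the instruction-cache
state can change through a fill operation which updates the cache with
the value of $pa$ in the memory. Instruction-cache lines never become dirty.


\[
\begin{aligned}
D_{\emptyset}(\bar{s}, \bar{s}', pa) \stackrel{\text{def}}{=}\ & \big(M'(pa) \neq M(pa) \Rightarrow (\text{d-dirty}(\bar{s}, pa) \wedge \neg\text{d-hit}(\bar{s}', pa) \wedge M'(pa) = \text{d-cnt}(\bar{s}, pa))\big) \\
\wedge\ & \big(\text{d-W}(\bar{s}', pa) \neq \text{d-W}(\bar{s}, pa) \Rightarrow (\neg\text{d-hit}(\bar{s}', pa) \wedge (\text{d-dirty}(\bar{s}, pa) \Rightarrow M'(pa) = \text{d-cnt}(\bar{s}, pa)))\big) \\[4pt]
D_{rd}(\bar{s}, \bar{s}', pa) \stackrel{\text{def}}{=}\ & (MMU(\bar{s}, pa, U, rd, -) \vee pa \in MD(\bar{s})) \wedge M'(pa) = M(pa) \\
\wedge\ & \big(\text{d-W}(\bar{s}', pa) \neq \text{d-W}(\bar{s}, pa) \Rightarrow \neg\text{d-hit}(\bar{s}, pa) \wedge \text{d-hit}(\bar{s}', pa) \wedge \neg\text{d-dirty}(\bar{s}', pa) \wedge \text{d-cnt}(\bar{s}', pa) = M(pa)\big) \\[4pt]
D_{wt}(\bar{s}, \bar{s}', pa) \stackrel{\text{def}}{=}\ & MMU(\bar{s}, pa, U, wt, -) \\
\wedge\ & \big(\text{d-W}(\bar{s}', pa) \neq \text{d-W}(\bar{s}, pa) \Rightarrow \text{d-dirty}(\bar{s}', pa) \vee M'(pa) = \text{d-cnt}(\bar{s}', pa)\big) \\
\wedge\ & \big(M'(pa) \neq M(pa) \Rightarrow (\neg\text{d-dirty}(\bar{s}', pa) \Rightarrow MMU(\bar{s}, pa, U, wt, \mathit{false}))\big) \\[4pt]
I_{\emptyset}(\bar{s}, \bar{s}', pa) \stackrel{\text{def}}{=}\ & \text{i-W}(\bar{s}', pa) \neq \text{i-W}(\bar{s}, pa) \Rightarrow \neg\text{i-hit}(\bar{s}', pa) \\[4pt]
I_{ex}(\bar{s}, \bar{s}', pa) \stackrel{\text{def}}{=}\ & MMU(\bar{s}, pa, U, ex, -) \\
\wedge\ & \big(\text{i-W}(\bar{s}', pa) \neq \text{i-W}(\bar{s}, pa) \Rightarrow \neg\text{i-hit}(\bar{s}, pa) \wedge \text{i-hit}(\bar{s}', pa) \wedge \neg\text{i-dirty}(\bar{s}', pa) \wedge \text{i-cnt}(\bar{s}', pa) = M(pa)\big)
\end{aligned}
\]


Fig. 1. Derivability. Here $\text{d-W}(\bar{s}, pa) = (\text{d-hit}(\bar{s}, pa), \text{d-dirty}(\bar{s}, pa), \text{d-cnt}(\bar{s}, pa))$ and
$\text{i-W}(\bar{s}, pa) = (\text{i-hit}(\bar{s}, pa), \text{i-dirty}(\bar{s}, pa), \text{i-cnt}(\bar{s}, pa))$ denote the cache-line contents
corresponding to $pa$ in $\bar{s}$.d-cache and $\bar{s}$.i-cache, $M = \bar{s}.\text{mem}$, $M' = \bar{s}'.\text{mem}$, and
$MMU(\bar{s}, pa, U, acc, c) = \exists va.\ MMU(\bar{s}, va, U, acc) = (pa, c)$.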


Figure 1 reports the formal definition of these predicates for a cache oper- 
ating in write-back mode, assuming cache line granularity is finer than page 
granularity, i.e., the same memory permissions hold for all entries of a given 
line. 
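
The data-cache side of derivability can be sketched for a single address as follows. This is an illustration under stated assumptions, not the formal model: a cache line is the triple (hit, dirty, cnt), the MMU permissions are passed in as booleans (`readable` stands for "readable in non-privileged mode or part of the MMU domain", `uncached_alias` for the existence of a writable uncacheable alias), and the eviction and fill cases are read as leaving the line absent and clean, respectively. All function and parameter names are hypothetical.

```python
from typing import NamedTuple, Optional


class Line(NamedTuple):
    hit: bool
    dirty: bool
    cnt: Optional[int]   # None when the line is absent


MISS = Line(False, False, None)


def d_evict(M, M2, w, w2):
    # memory may change only by writing back a dirty line that is then gone
    mem_ok = M2 == M or (w.dirty and not w2.hit and M2 == w.cnt)
    line_ok = w2 == w or (not w2.hit and (not w.dirty or M2 == w.cnt))
    return mem_ok and line_ok


def d_read(M, M2, w, w2, readable):
    # a fill turns a miss into a clean hit holding the memory value
    fill = not w.hit and w2.hit and not w2.dirty and w2.cnt == M
    return readable and M2 == M and (w2 == w or fill)


def d_write(M, M2, w, w2, writable, uncached_alias):
    line_ok = w2 == w or w2.dirty or M2 == w2.cnt
    mem_ok = M2 == M or w2.dirty or uncached_alias
    return writable and line_ok and mem_ok


def data_derivable(M, M2, w, w2, readable=False, writable=False,
                   uncached_alias=False):
    """At least one data-side case must explain the change at this address."""
    return (d_evict(M, M2, w, w2)
            or d_read(M, M2, w, w2, readable)
            or d_write(M, M2, w, w2, writable, uncached_alias))


clean = Line(True, False, 1)
# Memory changes under a clean line: only an uncacheable write explains it.
print(data_derivable(1, 9, clean, clean, writable=True))
print(data_derivable(1, 9, clean, clean, writable=True, uncached_alias=True))
```

The two calls show that a memory change underneath an unchanged clean line is derivable only when a writable uncacheable alias exists, which matches the remark that data-coherency can be broken only through such an alias.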

Note that in a cache, one cache line contains several locations and that writing 
one such location marks the whole line of the data-cache dirty. However, due to 
our definition of d-dirty the locations in the written line are not considered dirty, 
if they have the same value in cache as in memory. 

In practice, if $\bar{s} \rhd \bar{s}'$ then for a given location D-Coh can be invalidated only
if there exists a non-cacheable writable alias and I-Coh can be invalidated only
if there exists a writable alias. The following obligation shows that derivability
correctly overapproximates the hardware behavior:


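HW Obligation 2. For all $\bar{s}$ such that $\text{D-Coh}(\bar{s}, MD(\bar{s}, VA))$ and $MD(\bar{s}, VA) \cap
\{pa \mid \exists va.\ MMU(\bar{s}, va) = (acc, pa, c) \text{ and } (U, wt) \in acc\} = \emptyset$, if $\bar{s}'$ is reachable
by a non-privileged transition, i.e., $\bar{s} \to_U \bar{s}'$, then


1. $\bar{s} \rhd \bar{s}'$, i.e., $\bar{s}'$ is derivable from $\bar{s}$, and
2. if $\text{Mode}(\bar{s}') \neq U$ then $\text{ex-entry}(\bar{s}')$, i.e., the mode can only change by entering
an exception handler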


The precondition of the obligation requires the MMU domain to be data-coherent 
and to not overlap with the memory writable in non-privileged mode. This 
ensures that the MMU configuration is constant during the execution of instruc- 
tions that update multiple memory locations. This requirement also ensures 
transitivity of derivability. 

To complete the hardware abstraction we need sufficient conditions to ensure
that the cache-aware model behaves like the cacheless one. We use the functions
$\text{p-deps}(\bar{s}) \subseteq PA$ and $\text{v-deps}(\bar{s}) \subseteq VA$ to extract an overapproximation of the
physical and virtual addresses that affect the next transition of $\bar{s}$. For instance,
v-deps includes the program counter and the locations loaded and stored, while
$\text{p-deps}(\bar{s})$ includes the translation of the program counter, the translation of the
virtual addresses read, and the addresses that affect the translation of v-deps (i.e.,
$MD(\bar{s}, \text{v-deps}(\bar{s}))$). As usual, these definitions are lifted to the cacheless model
using empty caches. We say that $\bar{s}$ and $s$ are similar, if $(\bar{s}.\text{reg}, \bar{s}.\text{psreg}, \bar{s}.\text{coreg}) =
(s.\text{reg}, s.\text{psreg}, s.\text{coreg})$, $Dv(\bar{s}, pa) = s.\text{mem}(pa)$ for all $pa$ in $\text{p-deps}(s) \cap \text{p-deps}(\bar{s})$,
and $Iv(\bar{s}, MMU(\bar{s}, \bar{s}.\text{reg.pc})) = s.\text{mem}(MMU(s, s.\text{reg.pc}))$.


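HW Obligation 3. For all similar $\bar{s}$ and $s$:


1. $\text{p-deps}(\bar{s}) = \text{p-deps}(s)$ and $\text{v-deps}(\bar{s}) = \text{v-deps}(s)$
2. if $\bar{s} \to_m \bar{s}'\ [ops_1]$, $s \to_m s'\ [ops_2]$ and all accesses in $ops_1$ are cacheable (i.e.,
$wt(pa, c) \in ops_1$ or $rd(pa, c) \in ops_1$ implies $c$) then
(a) $ops_1 = ops_2$
(b) $(\bar{s}'.\text{reg}, \bar{s}'.\text{psreg}, \bar{s}'.\text{coreg}) = (s'.\text{reg}, s'.\text{psreg}, s'.\text{coreg})$
(c) for every $pa$, if $wt(pa, c) \in ops_1$ then $Dv(\bar{s}', pa) = s'.\text{mem}(pa)$,
otherwise $Mv(\bar{s}, pa) = Mv(\bar{s}', pa)$ and $s.\text{mem}(pa) = s'.\text{mem}(pa)$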


The obligation, thus, is to show that if 5 and s are similar, then their instructions 
have the same dependencies; the same physical addresses are read, written, and 
flushed; registers are updated in the same way; addresses written have the same 
values; addresses that are not written preserve their memory view. 

The last obligation describes cache effects of operations: 


HW Obligation 4. For every $\bar{s}$, if $\bar{s} \to_m \bar{s}'\ [ops]$ and all accesses in $ops$ are
cacheable then


1. for every $pa$, if $wt(pa, c) \in ops$ then $\text{D-Coh}(\bar{s}', \{pa\})$,
otherwise D-Coh, I-Coh and $\neg$d-dirty of $pa$ are preserved
2. if $fl_D(pa) \in ops$ then $\text{D-Coh}(\bar{s}', \{pa\})$ and $\neg\text{d-dirty}(\bar{s}', pa)$
3. if $fl_I(pa) \in ops$, $\text{D-Coh}(\bar{s}, \{pa\})$, and $\neg\text{d-dirty}(\bar{s}, pa)$ then $\text{I-Coh}(\bar{s}', \{pa\})$


If the kernel only uses cacheable aliases then memory writes establish
data-coherency; data- and instruction-coherency, as well as non-dirtiness, are
preserved for non-updated locations; data-cache flushes establish data-coherency
and make locations non-dirty; instruction-cache flushes make data-coherent,
non-dirty locations instruction-coherent.
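
The cache effects listed above can be replayed on a toy write-back cache. The class below is a sketch under strong assumptions (one value per address, no line granularity, no evictions, flushes that simply drop the line after an optional write-back); `ToyCaches` and its method names are hypothetical and do not model any specific processor.

```python
from typing import Dict, Tuple


class ToyCaches:
    """Write-back data-cache and instruction-cache over a flat memory."""

    def __init__(self, mem: Dict[int, int]):
        self.mem = dict(mem)
        self.d: Dict[int, Tuple[int, bool]] = {}  # pa -> (value, modified flag)
        self.i: Dict[int, int] = {}

    def d_dirty(self, pa: int) -> bool:
        return pa in self.d and self.d[pa][1] and self.d[pa][0] != self.mem[pa]

    def d_coh(self, pa: int) -> bool:
        incoherent_hit = pa in self.d and self.d[pa][0] != self.mem[pa]
        return (not incoherent_hit) or self.d_dirty(pa)

    def i_coh(self, pa: int) -> bool:
        i_ok = pa not in self.i or self.i[pa] == self.mem[pa]
        return self.d_coh(pa) and i_ok and not self.d_dirty(pa)

    def write_cacheable(self, pa: int, value: int) -> None:  # wt(pa, c), c true
        self.d[pa] = (value, True)

    def flush_data(self, pa: int) -> None:                   # fl_D(pa)
        if self.d_dirty(pa):
            self.mem[pa] = self.d[pa][0]                      # write back
        self.d.pop(pa, None)

    def flush_instr(self, pa: int) -> None:                  # fl_I(pa)
        self.i.pop(pa, None)


c = ToyCaches({0: 1})
c.d[0] = (2, False)   # stale clean data line, e.g. left behind by the application
c.i[0] = 3            # stale instruction line
print(c.d_coh(0), c.i_coh(0))
c.flush_data(0)
c.flush_instr(0)
print(c.d_coh(0), c.i_coh(0))
```

After the data flush followed by the instruction flush the location is both data- and instruction-coherent, mirroring items 2 and 3 of the obligation; a subsequent cacheable write keeps it data-coherent but makes it dirty.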


5.2 Application Level: Theorem 1 


To decompose the proof of Theorem 1, the invariant $\bar{I}$ is split into three parts:
a functional part $\bar{I}_{fun}$, which only depends on the data-view of the critical
resources, an invariant $\bar{I}_{coh}$, which only depends on data-coherency of the critical
resources and instruction-coherency of executable resources, and an optional
countermeasure-specific invariant $\bar{I}_{cm}$, which depends on coherency of non-critical
memory resources such as resources in an always-cacheable region:


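SW-I Obligation 1. For all $\bar{s}$, $\bar{I}(\bar{s}) = \bar{I}_{fun}(\bar{s}) \wedge \bar{I}_{coh}(\bar{s}) \wedge \bar{I}_{cm}(\bar{s})$ and:


1. for all $\bar{s}'$ if $\bar{s} \equiv_{CR} \bar{s}'$ then $\bar{I}_{fun}(\bar{s}) = \bar{I}_{fun}(\bar{s}')$;

2. for all $\bar{s}'$ if $\bar{s} \equiv_{CR} \bar{s}'$, $\text{D-Coh}(\bar{s}, CR(\bar{s}))$, $\text{D-Coh}(\bar{s}', CR(\bar{s}'))$, $\text{I-Coh}(\bar{s}, EX(\bar{s}))$,
and $\text{I-Coh}(\bar{s}', EX(\bar{s}'))$, then $\bar{I}_{coh}(\bar{s}) = \bar{I}_{coh}(\bar{s}')$;

3. for all $\bar{s}'$ if $\bar{I}(\bar{s})$ and $\bar{s} \rhd \bar{s}'$ then $\bar{I}_{cm}(\bar{s}')$.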


The invariants must prevent direct modification of the critical resources by 
the application, i.e., there is no address writable in non-privileged mode that 
points to a critical resource. Similarly, indirect modification, e.g., by line eviction, 
must be impossible. This is guaranteed if critical resources are data-coherent and 
executable resources are instruction-coherent. 


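SW-I Obligation 2. For all $\bar{s}$:


1. If $\bar{I}_{fun}(\bar{s})$ and $pa \in CR(\bar{s})$ then there is no $va$ such that $MMU(\bar{s}, va) =
(acc, pa, c)$ and $(U, wt) \in acc$
2. If $\bar{I}_{fun}(\bar{s})$ and $\bar{I}_{coh}(\bar{s})$ then $\text{D-Coh}(\bar{s}, CR(\bar{s}))$ and $\text{I-Coh}(\bar{s}, EX(\bar{s}))$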


Also, the functions CR and EX must be correctly defined: resources needed to 
identify the set of critical kernel resources are critical themselves, as are resources 
affecting the MMU configuration (i.e., the page-tables). 


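SW-I Obligation 3. For all $\bar{s}, \bar{s}'$:


1. If $\bar{I}_{fun}(\bar{s})$, $\bar{s} \equiv_D \bar{s}'$ and $\bar{s}.\text{coreg} = \bar{s}'.\text{coreg}$ then $CR(\bar{s}) = CR(\bar{s}')$, $EX(\bar{s}) =
EX(\bar{s}')$, and $EX(\bar{s}) \subseteq CR(\bar{s})$
2. If $\bar{I}_{fun}(\bar{s})$ then $MD(\bar{s}, VA) \subseteq CR(\bar{s})$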


The following lemmas assume HW Obligation 2 and SW-I Obligations 1-3. 
First, we show that the application cannot modify critical resources. 


Lemma 2. For all $\bar{s}, \bar{s}'$ such that $\bar{I}(\bar{s})$, if $\bar{s} \rhd \bar{s}'$ then $\bar{s} \equiv_{CR} \bar{s}'$.


Proof. Since $\bar{I}(\bar{s})$ holds, the MMU prohibits writable accesses of the application
to critical resources (SW-I Obligation 2.1). Also, derivability shows that the
application can directly change only resources that are writable according to the
MMU. Thus, the application cannot directly update $CR(\bar{s})$. Besides, the invariant
guarantees data-coherency of critical resources and instruction-coherency of
executable resources in $\bar{s}$ (SW-I Obligation 2.2). This prevents indirect
modifications of these resources. Finally, SW-I Obligation 3.1 ensures that the kernel
data-structures that identify what is critical cannot be altered. □


To complete the proof of Theorem 1 we additionally need to show that
coherency of critical resources (Lemma 3) and the functional invariant (Lemma 4) 
are preserved by non-privileged transitions. 


Lemma 3. For all $\bar{s}$, if $\bar{I}(\bar{s})$ and $\bar{s} \rhd \bar{s}'$ then $\text{D-Coh}(\bar{s}', CR(\bar{s}'))$ and $\text{I-Coh}(\bar{s}', EX(\bar{s}'))$.


Proof. From the previous lemma we get $CR(\bar{s}') = CR(\bar{s})$ and $EX(\bar{s}') = EX(\bar{s})$.
Coherency of these resources in $\bar{s}$ is given by SW-I Obligation 2.2. From derivability
we know that data-coherency can be invalidated only through non-cacheable
writes; instruction-coherency can be invalidated only through writes to
executable resources. SW-I Obligation 2.1 yields that there is no alias writable in
non-privileged mode pointing to a critical resource; using SW-I Obligation 3.1,
executable resources then cannot be written either. □


Lemma 4. For all $\bar{s}$ and $\bar{s}'$, if $\bar{I}(\bar{s})$ and $\bar{s} \to_U \bar{s}'$ then $\bar{I}_{fun}(\bar{s}')$.


Proof. To show that non-privileged transitions preserve the invariant we use HW
Obligation 2.1, Lemma 2, and SW-I Obligation 1.1. □


We are now able to complete the proof of application integrity. The following 
Lemma directly proves Theorem 1 if the proof obligations are met. 


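Lemma 5 (Application Integrity). For all $\bar{s}$, if $\bar{I}(\bar{s})$ and $\bar{s} \to_U \bar{s}'$ then
$\bar{I}(\bar{s}')$, $\bar{s} \equiv_{CR} \bar{s}'$, and if $\text{Mode}(\bar{s}') \neq U$ then $\text{ex-entry}(\bar{s}')$.


Proof. By HW Obligation 2, $\bar{s} \rhd \bar{s}'$ and if $\text{Mode}(\bar{s}') \neq U$ then $\text{ex-entry}(\bar{s}')$. By
Lemma 2, $\bar{s} \equiv_{CR} \bar{s}'$. By Lemma 4, $\bar{I}_{fun}(\bar{s}')$. By Lemma 3, $\text{D-Coh}(\bar{s}', CR(\bar{s}'))$ and
$\text{I-Coh}(\bar{s}', EX(\bar{s}'))$. By SW-I Obligation 2.2, $\text{D-Coh}(\bar{s}, CR(\bar{s}))$ and $\text{I-Coh}(\bar{s}, EX(\bar{s}))$.
By SW-I Obligation 1.2 and $\bar{I}(\bar{s})$, $\bar{I}_{coh}(\bar{s}')$. Then by SW-I Obligation 1.3, $\bar{I}_{cm}(\bar{s}')$,
thus $\bar{I}(\bar{s}') = \bar{I}_{fun}(\bar{s}') \wedge \bar{I}_{coh}(\bar{s}') \wedge \bar{I}_{cm}(\bar{s}')$ holds. □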


5.3 Kernel Level: Theorem 2 


Our goal is to constrain kernel execution in such a way that it behaves identically 
in the cache-aware and the cacheless model. The challenge is to find suitable 
proof obligations for the kernel code that are stated on the cacheless model, so 
they can be verified using existing tools for binary analysis. 

The first code verification obligation requires showing that the kernel
preserves the invariant when there is no cache:


SW-C Obligation 1. For all $s, s'$, if $I(s)$, $\text{ex-entry}(s)$, and $s \rightsquigarrow s'$, then $I(s')$.


We impose two requirements on the kernel virtual memory: the addresses
in $K_{vm}$ must be cacheable (so that the kernel uses the data-view of memory
resources) and $K_{ex}$ must be mapped to a subset of the executable resources.


SW-I Obligation 4. For all $s$ such that $I(s)$:


1. For every $va \in K_{vm}$, if $MMU(s, va) = (acc, pa, c)$ then $c$ holds.
2. For every $va \in K_{ex}$, if $MMU(s, va) = (acc, pa, c)$ then $pa \in EX(s)$.


A common problem of verifying low-level software is to couple the invariant 
with every possible internal state of the kernel. This is a major concern here, 
since the set of critical resources changes dynamically and can be stale while 
the kernel is executing. We solve this problem by defining an internal invariant
$II(s, s')$, which allows us to define properties of the state $s'$ in relation to the
initial state $s$ of the kernel handler.


Definition 4. The intermediate invariants $II(s, s')$ for the cacheless model and
$\overline{II}(\bar{s}, \bar{s}')$ for the cache-aware model hold if:


1. $s'.\text{reg.pc} \in K_{ex}$ and $\bar{s}'.\text{reg.pc} \in K_{ex}$, respectively,

2. for all $pa \in PA$: if $pa \in MD(s, K_{vm})$ then $s.\text{mem}(pa) = s'.\text{mem}(pa)$ and if
$pa \in MD(\bar{s}, K_{vm})$ then $Dv(\bar{s}, pa) = Dv(\bar{s}', pa)$, respectively,

3. $\text{v-deps}(s') \subseteq K_{vm}$ and $\text{v-deps}(\bar{s}') \subseteq K_{vm}$, respectively,

4. $II_{cm}(s, s')$ and $\overline{II}_{cm}(\bar{s}, \bar{s}')$, respectively: additional countermeasure-specific
requirements that will be instantiated in Sect. 5.4, and

5. only for the cache-aware model: $\text{D-Coh}(\bar{s}', CR(\bar{s}))$.


Now we demand a proof that the intermediate invariant is preserved in the 
cacheless model during kernel execution, i.e., that (1) the kernel does not execute 
instructions outside its code region, (2) the kernel does not change page-table 
entries that map its virtual memory, (3) the kernel does not leave its virtual 
address space, and (4) the kernel implements the countermeasure correctly. 


SW-C Obligation 2. For all $s, s'$, if $I(s)$, $\text{ex-entry}(s)$, and $s \to_P^{*} s'$, then
$II(s, s')$.


We require a demonstration of the correctness of the countermeasure, showing
that it guarantees coherency of dependencies during kernel execution.


SW-I Obligation 5. For all $\bar{s}, \bar{s}'$, if $\bar{I}(\bar{s})$, $\overline{II}_{cm}(\bar{s}, \bar{s}')$, and $\bar{s}'.\text{reg.pc} \in K_{ex}$ then
$\text{D-Coh}(\bar{s}', \text{p-deps}(\bar{s}'))$ and $\text{I-Coh}(\bar{s}', MMU(\bar{s}', \bar{s}'.\text{reg.pc}))$.


We introduce the simulation relation between the two models: $\bar{s}\ R_{sim}\ s$
iff $(\bar{s}.\text{reg}, \bar{s}.\text{psreg}, \bar{s}.\text{coreg}) = (s.\text{reg}, s.\text{psreg}, s.\text{coreg})$ and for all $pa$, $Mv(\bar{s}, pa) =
s.\text{mem}(pa)$. The intuition in using the memory-view is that it is equal to the
data-view for coherent locations and is unchanged (as demonstrated by HW
Obligation 3) for incoherent locations that are not directly accessed by the kernel.

The following proof obligation connects the simulation relation, the invariants
and the intermediate invariants: (1) the invariant of the cache-aware model can
be transferred to the cacheless model via the simulation; (2) after the execution
of a handler (i.e., $\text{Mode}(s') = U$), if the two intermediate invariants hold then the
simulation allows transferring the functional invariant of the cacheless model to
the cache-aware model and guarantees coherency of critical resources; and (3) the
cache-aware intermediate invariant ensures the countermeasure requirements.


SW-I Obligation 6. For all $\bar{s}, s$ such that $\bar{s}\ R_{sim}\ s$ and $\bar{I}(\bar{s})$:


1. $I(s)$ holds and $II_{cm}(s, s)$ implies $\overline{II}_{cm}(\bar{s}, \bar{s})$,

2. for all $\bar{s}', s'$ such that $\bar{s}'\ R_{sim}\ s'$, if $\overline{II}(\bar{s}, \bar{s}')$, $II(s, s')$, $I(s')$, and $\text{Mode}(s') = U$
then $\bar{I}_{fun}(\bar{s}')$ and $\bar{I}_{coh}(\bar{s}')$, and

3. for all $\bar{s}'$, if $\bar{I}_{fun}(\bar{s}')$, $\bar{I}_{coh}(\bar{s}')$, and $\overline{II}(\bar{s}, \bar{s}')$ then $\bar{I}_{cm}(\bar{s}')$.


The following lemmas assume that the proof obligations hold. First we show
that the intermediate invariant can be transferred from the cacheless to the
cache-aware model.


Lemma 6. Suppose that $\bar{s}_0\ R_{sim}\ s_0$, $\bar{s}\ R_{sim}\ s$, $\bar{s} \to_P \bar{s}'\ [ops]$, $s \to_P s'\ [ops]$, and
$\bar{s}'\ R_{sim}\ s'$. If $\bar{I}(\bar{s}_0)$, $II(s_0, s)$, $II(s_0, s')$, and $\overline{II}(\bar{s}_0, \bar{s})$ then $\overline{II}(\bar{s}_0, \bar{s}')$.


Proof. Transferring the property of Definition 4.1 from $s'$ to $\bar{s}'$ is trivial, since
$R_{sim}$ guarantees equivalence of registers.

For Definition 4.5 we show that the kernel only performs cacheable accesses
in $ops$ from $s$ (due to SW-I Obligation 4 and HW Obligation 1.2); these are the
same accesses performed in $\bar{s}$; $CR(\bar{s}_0)$ is data-coherent in $\bar{s}$ due to $\overline{II}(\bar{s}_0, \bar{s})$;
coherency is preserved from $\bar{s}$ to $\bar{s}'$ due to HW Obligation 4.

For Definition 4.2: Let $D = MD(s_0, K_{vm})$; $II(s_0, s')$ ensures that the memory
in $D$ is the same in $s_0$, $s$, and $s'$; $R_{sim}$ guarantees that the memory-view of $D$ in
$\bar{s}_0$ is equal to the content of the memory in $s_0$; $D$ is data-coherent in $\bar{s}_0$ by HW
Obligation 1.1, SW-I Obligations 3.2 and 2.2, hence by Lemma 1 the data-view of
$D$ in $\bar{s}_0$ is equal to its memory content in $s_0$ and $s'$; also $D = MD(\bar{s}_0, K_{vm})$ due
to HW Obligation 1.2; similarly, $R_{sim}$ guarantees that the memory-view of $D$ in
$\bar{s}'$ is equal to the memory content of $D$ in $s'$; then locations in $D$ have the same
data-view in $\bar{s}_0$ and $\bar{s}'$ via Lemma 1, if $D$ is coherent in $\bar{s}'$. This follows from
$\text{D-Coh}(\bar{s}', CR(\bar{s}_0))$ (shown above), HW Obligation 1.1, and SW-I Obligation 3.2.

For Definition 4.4 we rely on a further proof obligation that demonstrates 
correctness of the countermeasure: if the software implements the countermea- 
sure in the cacheless model, then the additional coherency requirements on the 
cache-aware model are satisfied. 


SW-I Obligation 7. Assume $\bar{s}_0\ R_{sim}\ s_0$, $\bar{s}\ R_{sim}\ s$, $\bar{s} \to_P \bar{s}'\ [ops]$, $s \to_P
s'\ [ops]$, and $\bar{s}'\ R_{sim}\ s'$. If $\bar{I}(\bar{s}_0)$, $II(s_0, s)$, $II(s_0, s')$, and $\overline{II}(\bar{s}_0, \bar{s})$ then
$\overline{II}_{cm}(\bar{s}_0, \bar{s}')$.


From this we also establish coherency of the dependencies of $\bar{s}'$ (due to SW-I
Obligation 5), thus the data-view and the memory-view of the dependencies
of $\bar{s}'$ are the same (Lemma 1). The dependencies of $s'$ and $\bar{s}'$ have the same
memory content via the simulation relation. Therefore $s'$ and $\bar{s}'$ are similar; by
HW Obligation 3.1, we transfer the property of Definition 4.3 from $s'$ to $\bar{s}'$. □

The following lemma shows that the simulation relation and the intermediate
invariant are preserved while the kernel is executing.


Lemma 7. Suppose that $\bar{I}(\bar{s})$, $\text{ex-entry}(\bar{s})$, and $\bar{s}\ R_{sim}\ s$. If $\bar{s} \to_P^{n} \bar{s}'$ then $s \to_P^{n}
s'$ for some $s'$ such that $\bar{s}'\ R_{sim}\ s'$, $\overline{II}(\bar{s}, \bar{s}')$, and $II(s, s')$.


Proof. Internal invariant $II(s, s')$ is directly obtained from SW-C Obligation 2.
We prove the remaining goals by induction on the execution length. Simulation
in the base case is trivial, as no step is taken, and $\overline{II}(\bar{s}, \bar{s})$ follows from $II(s, s)$, the
coherency of critical resources in $\bar{s}$, SW-I Obligations 6.1 and 5, the simulation
relation and Lemma 1, as well as HW Obligation 3.1.

For the inductive case we first show that the simulation relation is preserved.
$R_{sim}$ guarantees that $s'$ and $\bar{s}'$ have the same registers, SW-I Obligation 5 ensures
that the memory pointed to by the program counter is instruction-coherent and the
instruction dependencies are data-coherent. Therefore, by Lemma 1 and $R_{sim}$, we
can ensure all preconditions of HW Obligation 3, which shows that the simulation
is preserved. We use Lemma 6 to transfer the intermediate invariant. □


Fig. 2. Verification of kernel integrity: inductive simulation proof and invariant transfer


Figure 2 indicates how the various proof obligations and lemmas of the section 
tie together. We are now able to complete the proof of kernel integrity. The 
following lemma directly proves Theorem 2 if the proof obligations are met. 


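Lemma 8 (Kernel Integrity). For all $\bar{s}_1$ and $s_1$ such that $\bar{I}(\bar{s}_1)$, $\bar{s}_1\ R_{sim}\ s_1$,
and $\text{ex-entry}(\bar{s}_1)$, if $\bar{s}_1 \rightsquigarrow \bar{s}_2$ then $\exists s_2.\ s_1 \rightsquigarrow s_2$, $\bar{s}_2\ R_{sim}\ s_2$ and $\bar{I}(\bar{s}_2)$.


Proof. From $\bar{s}_1 \rightsquigarrow \bar{s}_2$ we have $\bar{s}_1 \to_P^{n} \bar{s}_2$ for some $n$; by Lemma 7 we find $s_2$
such that $s_1 \to_P^{n} s_2$, $\bar{s}_2\ R_{sim}\ s_2$, $\overline{II}(\bar{s}_1, \bar{s}_2)$, and $II(s_1, s_2)$. Then $s_1 \rightsquigarrow s_2$ as $s_2$
and $\bar{s}_2$ are in the same mode. By SW-I Obligation 6.1 we obtain $I(s_1)$. Then by
SW-C Obligation 1, $I(s_2)$. SW-I Obligation 6.2 yields $\bar{I}_{fun}(\bar{s}_2)$ and $\bar{I}_{coh}(\bar{s}_2)$, and
by SW-I Obligation 6.3, $\bar{I}_{cm}(\bar{s}_2)$. It follows that $\bar{I}(\bar{s}_2)$ holds, as desired. □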


5.4 Correctness of Countermeasures 


Verification of a countermeasure amounts to instantiating all invariants that are
not software-specific and discharging the corresponding proof obligations. We
verify combinations of always cacheability or selective eviction of the data-cache,
and constant program memory or selective eviction of the instruction-cache.


Always Cacheability and Constant Program Memory. Let $M_{ac} \subseteq PA$ be the
region of physical memory that must always be accessed using cacheable aliases.
The software needs to preserve two properties: (8.1) there are no uncacheable
aliases to $M_{ac}$, (8.2) the kernel never allocates critical resources outside $M_{ac}$:


SW-I Obligation 8. If $I(s)$ holds, then:


1. For every $va$, if $MMU(s, va) = (acc, pa, c)$ and $pa \in M_{ac}$ then $c$.
2. $CR(s) \subseteq M_{ac}$.


For this countermeasure, the non-functional invariants are defined as follows:

$\bar{I}_{coh}(\bar{s})$ states that critical resources are data-coherent and executable resources
are instruction-coherent (from which SW-I Obligation 1.2 and SW-I Obligation 2.2
follow directly).

$\bar{I}_{cm}(\bar{s})$ states that addresses in $M_{ac}$ that are not critical are data-coherent (SW-I
Obligation 1.3 holds as there are no uncacheable aliases to $M_{ac}$).

$II_{cm}(s, s')$ states that dependencies of instructions in $s'$ are in $M_{ac}$, no kernel
write targets $EX(s)$ (i.e., there is no self-modifying code), and when the kernel
handler completes $EX(s') \subseteq EX(s)$.

$\overline{II}_{cm}(\bar{s}, \bar{s}')$ states that dependencies of instructions in $\bar{s}'$ are in $M_{ac}$, $M_{ac}$ is
data-coherent, and $EX(\bar{s})$ is instruction-coherent (SW-I Obligation 5 holds due to
SW-I Obligation 4, i.e., the kernel fetches instructions from $EX(\bar{s})$ only).


The cache-aware functional invariant $\bar{I}_{fun}$ is defined equivalently to $I$ using
$Dv(\bar{s}, pa)$ in place of $s.\text{mem}(pa)$. This and the two intermediate invariants make it
possible to transfer properties between the two models, establishing SW-I Obligation 6.

The proof of SW-I Obligation 7 (i.e., the cache-aware intermediate invariant
$\overline{II}_{cm}$ is preserved) consists of three tasks: (1) data-coherency of $M_{ac}$ is
preserved, since SW-I Obligation 4 and the intermediate invariant imply that the
kernel only performs cacheable accesses, so data-coherency cannot be invalidated; (2)
instruction-coherency is guaranteed by the fact that there is no self-modifying
code and HW Obligation 4; (3) the hypothesis of HW Obligation 3.1 (which
shows that the cacheless and cache-aware models have the same dependencies) is
ensured by the fact that cacheless dependencies are in $M_{ac}$, which is data-coherent.


Selective Eviction of Data-Cache and Constant Program Memory. Unlike
always cacheability, selective eviction does not require establishing a
functional property (i.e., SW-I Obligation 8). Instead, it is necessary to verify that
resources acquired from the application are accessed by the kernel only after
they are made coherent via cache flushing. For this purpose, we extend the two
models with a history variable $h$ that keeps track of all effects of instructions
executed by the kernel (i.e., if $s \to_m s'\ [ops]$ then $(s, h) \to_m (s', h')\ [ops]$ and
$h' = h; ops$). Let $C(s, s')$ be the set of resources that were critical in $s$ or that
have been data-flushed in the history of $s'$. Hereafter we only describe the parts
of the non-functional invariants that deal with the data-cache, since for the
instruction-cache we use the same countermeasure as in the previous case.
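
As a small illustration of the history variable, the sketch below records operations as (kind, address) pairs and computes the set described above. The tuple encoding, the operation tags "wt", "rd", "flD", "flI", and the helper names are assumptions made for this example; cacheability attributes are omitted.

```python
from typing import Iterable, List, Set, Tuple

Op = Tuple[str, int]  # (kind, pa) with kind in {"wt", "rd", "flD", "flI"}


def extend_history(h: List[Op], ops: Iterable[Op]) -> List[Op]:
    """h' = h; ops, i.e. append the operations of one kernel step."""
    return h + list(ops)


def flushed_or_critical(cr_s: Set[int], h: List[Op]) -> Set[int]:
    """Resources critical in the initial state s, or data-flushed in the
    history accumulated up to s'."""
    return set(cr_s) | {pa for kind, pa in h if kind == "flD"}


h: List[Op] = []
h = extend_history(h, [("rd", 1)])      # kernel reads a critical resource
h = extend_history(h, [("flD", 7)])     # flush a page handed over by the application
h = extend_history(h, [("rd", 7)])      # only now is it accessed
print(sorted(flushed_or_critical({1, 2}, h)))
```

A check of the intermediate invariant would then require every dependency of the current instruction to lie in this set, so an access to address 7 before the flush would be rejected.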


$\bar{I}_{coh}(\bar{s})$ is the same as for always cacheability, while $\bar{I}_{cm}(\bar{s}) = true$, since
the countermeasure is not a state-based property.

$II_{cm}(s, s')$ (and $\overline{II}_{cm}(\bar{s}, \bar{s}')$) states that dependencies in $s'$ are in $C(s, s')$
($C(\bar{s}, \bar{s}')$, respectively) and that $CR(s') \subseteq CR(s) \cup C(s, s')$ if $Mode(s') = U$.


Again, the cache-aware functional invariant $\bar{I}_{fun}$ is defined equivalently to $I_{fun}$ using
the data-view of memory resources.

The proofs of SW-I Obligation 6 and SW-I Obligation 7 are similar to the
ones above. Instead of Mac they rely on the data-coherency of $C(\bar{s}, \bar{s}')$ and the
fact that data-cache flushes always establish coherency (HW Obligation 4).

Selective Eviction of Instruction-Cache. The two previous countermeasures for
the data-cache can be combined with selective eviction of the instruction-cache to
support dynamic code. The requirements that the kernel does not write into
executable resources and that these are not extended are replaced by the following
property. Let $C'(s, s')$ be the set of executable resources in $s$ that have not been
written in the history of $s'$, joined with the resources that have been data-flushed,
instruction-flushed, and have not been overwritten after the flushes. The
intermediate invariant $II_{cm}(s, s')$ (and analogously $\overline{II}_{cm}(\bar{s}, \bar{s}')$ for the cache-aware
model) states that the translation of the program counter is in $C'(s, s')$, and,
when the kernel handler completes, $EX(s') \subseteq C'(s, s')$. Additionally, $\overline{II}_{cm}(\bar{s}, \bar{s}')$
states that $C'(\bar{s}, \bar{s}')$ is instruction-coherent. SW-I Obligation 5 holds because the
kernel only fetches instructions from $C'(\bar{s}, \bar{s}')$.

The main change to the proof of SW-I Obligation 7 consists in showing
instruction-coherency of $C'(\bar{s}, \bar{s}')$, which is ensured by the fact that data- and
instruction-flushing a location makes it instruction-coherent (HW Obligation 4).


5.5 Verification of a Specific Software 


Table 1 summarizes the proof obligations we identified. As the countermeasures
are verified, three groups of proof obligations remain for a specific software:
(1) SW-I Obligation 2.1, SW-I Obligation 3, and SW-C Obligation 1: these are
requirements for a secure kernel independently of caches; (2) SW-I Obligation 4
(and SW-I Obligation 8 for always-cacheability): these only constrain the
configuration of the MMU; (3) SW-C Obligation 2: during the execution the kernel
(i) stays in its code region, (ii) does not change or leave its virtual memory,
(iii) preserves the countermeasure-specific intermediate invariant. These must
be verified for intermediate states of the kernel, e.g., by inlining assertions that
guarantee (i)-(iii). Notice that the two code verification tasks (SW-C Obligation 1
and SW-C Obligation 2) do not require the usage of the cache-aware model,
enabling the usage of existing binary analysis tools.


Table 1. List of proof obligations

Type | # | Description
HW   | 1 | Constraints on the MMU domain
HW   | 2 | Derivability correctly overapproximates the hardware behavior
HW   | 3 | Conditions ensuring that the cache-aware model behaves like the cacheless one
HW   | 4 | Sufficient conditions for preserving coherency
SW-I | 1 | Decomposition of the invariant
SW-I | 2 | Invariant prevents direct and indirect modification of the critical resources
SW-I | 3 | Correct definition of CR and EX
SW-I | 4 | Kernel virtual memory is cacheable and its code is in the executable resources
The following obligations were proved for the selected countermeasures:
SW-I | 5 | Correctness of the countermeasure
SW-I | 6 | Transfer of the invariants from the cacheless model to the cache-aware one
SW-I | 7 | Transfer of the countermeasure properties
SW-C | 1 | Kernel preserves the invariant in the cacheless model
SW-C | 2 | Kernel preserves the intermediate invariant in the cacheless model


Case Study. As a case study, we use a hypervisor capable of hosting a Linux
guest. The hypervisor has previously been formally verified on a cacheless
model [21], and its vulnerability to cache storage channel attacks is shown in [20].
The memory subsystem is virtualized through direct paging. To create a page-table,
a guest prepares it in guest memory and requests its validation. If the validation
succeeds, the hypervisor can use the page-table to configure the MMU, without
requiring memory copy operations. The validation ensures that the page-table does
not allow writable accesses of the guest outside the guest’s memory or to the
page-tables. Other hypercalls allow freeing and modifying validated page-tables.

Using mismatched cacheability attributes, a guest can potentially violate 
memory isolation: it prepares a valid page-table in cache and a malicious page- 
table in memory; if the hypervisor validates stale data from the cache, after 
eviction, the MMU can be made to use the malicious page-table, enabling the 
guest to violate memory isolation. We fix this vulnerability by using always 
cacheability: The guest is forced to create page-tables only inside an always 
cacheable region of memory. 
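
The attack and its fix can be illustrated with a toy model. The sketch below is not the HOL4 model used in the proofs: it assumes a single memory location holding a "page-table", a one-line write-back cache, and hypothetical names (`ToySystem`, `run_scenario`). It only shows why a stale clean cache line lets the validated content differ from the content used after eviction, and why forcing cacheable accesses removes the mismatch.

```python
class ToySystem:
    """One memory location with a one-line write-back cache (illustrative only)."""

    def __init__(self, initial):
        self.memory = initial   # backing store
        self.cache = None       # cached copy, or None if the line is absent
        self.dirty = False

    def read(self, cacheable):
        if not cacheable:
            return self.memory          # uncacheable alias bypasses the cache
        if self.cache is None:
            self.cache = self.memory    # line fill on a miss
            self.dirty = False
        return self.cache

    def write(self, value, cacheable):
        if cacheable:
            self.cache = value
            self.dirty = True
        else:
            self.memory = value         # cached copy (if any) becomes stale

    def evict(self):
        if self.cache is not None and self.dirty:
            self.memory = self.cache    # write back a dirty line
        self.cache = None
        self.dirty = False


def run_scenario(guest_has_uncacheable_alias):
    system = ToySystem(initial="valid")
    system.read(cacheable=True)   # guest loads the valid table into the cache
    # Guest replaces the table; without the countermeasure it can bypass the cache.
    system.write("malicious", cacheable=not guest_has_uncacheable_alias)
    validated = system.read(cacheable=True)   # hypervisor validates via the cache
    system.evict()
    used_later = system.read(cacheable=True)  # content seen after eviction
    return validated, used_later
```

With an uncacheable alias the hypervisor validates "valid" while "malicious" is used afterwards; with always cacheability the hypervisor validates exactly what is used later, so an ordinary validation check can reject it.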

The general concepts of Sect. 4.1 are easily instantiated for the hypervisor.
Since it uses a static region of physical memory HM, the critical resources
consist of HM and every memory page that is allocated to store a page-table.
In addition to the properties described in [21], the invariant requires that all
page-tables are allocated in Mac and all aliases to Mac are cacheable. To
guarantee these properties the hypervisor code has been updated: validation of a
page-table checks that the page resides in Mac and that all new mappings to Mac
are cacheable; modification of a page-table forbids uncacheable aliases to Mac.


6 Implementation 


The complete proof strategy has been implemented [2] and machine-checked 
using the HOL4 interactive theorem prover [1]. The resulting application and 
kernel integrity theorems are parametric in the countermeasure-dependent proof 
obligations. These obligations have been discharged for the selected counter- 
measures yielding theorems that depend only on code verification conditions and 
properties of the functional kernel invariant. Hardware obligations have been ver- 
ified on a single-core model consisting of a generic processor and memory inter- 
face. While the processor interface has not been instantiated yet, all assumptions 
on the memory system have been validated for an instantiation with single-level 
data- and instruction-caches using a rudimentary cache implementation. Instan- 
tiation with more realistic models is ongoing. The formal analysis took three 
person months and consists of roughly 10000 LoC for the hardware model spec- 
ification and verification of its instantiation, 2500 LoC for the integrity proof, 
and 2000 LoC for the countermeasure verification. 

For the case study we augmented the existing hypervisor with the always
cacheability countermeasure. This entailed some engineering effort to adapt the 
memory allocator of the Linux kernel to allocate page-tables inside Mac. The 
adaptation required changes to 45 LoC in the hypervisor and an addition of 35 
LoC in the paravirtualized Linux kernel and imposes a negligible performance 
overhead (< 1% in micro- and macro-benchmarks [20]). The HOL4 model of the 
hypervisor design has been modified to include the additional checks performed 
by the hypervisor. Similarly, we extended the invariant with the new properties 
guaranteed by the adopted countermeasure. The model has been used to show 
that the new design preserves the invariant and that all proof obligations on the 
invariant hold, which required 2000 HOL4 LoC. Verification of the augmented 
hypervisor binary is left for future work. Even if binary verification can be auto- 
mated to a large extent using binary analysis tools (e.g. [11,30]), it still requires 
a substantial engineering effort. 


7 Conclusion 


Modern hardware architectures are complex and can exhibit unforeseen
vulnerabilities if low-level details are not properly taken into account. The cache
storage channels of [20], as well as the recent Meltdown [26] and Spectre [28]
attacks, are examples of this problem. They show the importance of low-level
system details and the need for sound and tractable strategies to reason about
them in the verification of security-critical software.

Here we presented an approach to verify integrity-preserving countermea- 
sures in the presence of cache storage side-channels. In particular, we identified 
conditions that must be met by a security mechanism to neutralise the attack 
vector and we verified correctness of some of the existing techniques to counter 
both (instruction- and data-cache) integrity attacks. 

The countermeasures are formally modelled as new proof obligations that can
be imposed on the cacheless model to ensure the absence of vulnerabilities due to
cache storage channels. The results of this analysis are the theorems in Sect. 4.3.
They demonstrate that software satisfying a set of proof obligations (i.e., correctly
implementing the countermeasure) is not vulnerable because of cache storage
channels.

Our analysis is based on an abstract hardware model that should fit a number
of architectures. While here we only expose two execution modes, we can support
multiple modes of execution, where the most privileged is used by the kernel
and all other modes are considered to be used by the application. Also, our MMU
model is general enough to cover other hardware-based protection mechanisms,
like Memory Protection Units or TrustZone memory controllers.

While this paper exemplifies the approach for first-level caches, our method- 
ology can be extended to accommodate more complex scenarios and other hard- 
ware features too. For instance our approach can be used to counter storage 
channels due to TLBs, multi-level caches, and multi-core processing. 

Translation Look-aside Buffers (TLBs) can be handled similarly to
instruction-caches. Non-privileged instructions are unable to directly modify the
TLB, and incoherent behaviour can arise only with the assistance of the kernel
modifying the page-tables. Incoherent behaviour can be prevented using TLB cleans
or by demonstrating that the page-tables are not changed.

Multi-level caches can be handled iteratively in a straightforward fashion,
starting from the cacheless model and adding CPU-closer levels of cache at each
iteration. Iterative refinement has three benefits: it enables the use of existing
(cache-unaware) analysis tools for verification, it enables the transfer of results
from Sects. 5.3 and 5.4 to the more complex models, and it allows one to focus on
each hardware feature independently, thus at least partially counteracting the
pressure towards ever larger and more complex global models.

In the same way the integrity proof can be repeated for the local caches in 
a multi-core system. For shared caches the proof strategy needs to be adapted 
to take into account interleaved privileged and non-privileged steps of different 
cores, depending on the chosen verification methodology for concurrent code. 

It is also worth noting that our verification approach works for both preemptive
and non-preemptive kernels, due to the use of the intermediate invariants
$II$ and $\overline{II}$ that do not depend on intermediate states of kernel data structures.

For non-privileged transitions the key tool is the derivability relation, which is 
abstract enough to fit a variety of memory systems. However, derivability has the 
underlying assumption that only uncacheable writes can bypass the cache and 
break coherency. If a given hardware allows the application to break coherency 
through other means, e.g., non-temporal store instructions or invalidate-only 
cache flushes, these cases need to be added to the derivability definition. 

The security analysis requires trustworthy models of hardware, which are 
needed to verify platform-dependent proof obligations. Some of these properties 
require extensive tests to demonstrate that corner cases are correctly handled by 
models. For example, while the conventional wisdom is that flushing caches can 
close side-channels, a new study [16] showed flushing does not sanitize caches 
thoroughly and leaves some channels active, e.g. instruction-cache attack vectors. 

There are several open questions concerning side-channels due to similar 
shared low-level hardware features such as branch prediction units, which under- 
mine the soundness of formal verification. This is an unsatisfactory situation 
since formal proofs are costly and should pay off by giving reliable guarantees. 
Moreover, the complexity of contemporary hardware is such that a verification 
approach allowing reuse of models and proofs as new hardware features are added 
is essential for formal verification in this space to be economically sustainable. 
Our results represent a first step towards giving reliable guarantees and reusable 
proofs in the presence of low level storage channels. 

Abstract. In the inference attacks studied in Quantitative Information
Flow (QIF), the adversary typically tries to interfere with the system in 
the attempt to increase its leakage of secret information. The defender, 
on the other hand, typically tries to decrease leakage by introducing some 
controlled noise. This noise introduction can be modeled as a type of pro- 
tocol composition, i.e., a probabilistic choice among different protocols, 
and its effect on the amount of leakage depends heavily on whether or 
not this choice is visible to the adversary. In this work we consider oper- 
ators for modeling visible and invisible choice in protocol composition, 
and we study their algebraic properties. We then formalize the interplay 
between defender and adversary in a game-theoretic framework adapted 
to the specific issues of QIF, where the payoff is information leakage. We 
consider various kinds of leakage games, depending on whether players 
act simultaneously or sequentially, and on whether or not the choices of 
the defender are visible to the adversary. Finally, we establish a hierar- 
chy of these games in terms of their information leakage, and provide 
methods for finding optimal strategies (at the points of equilibrium) for 
both attacker and defender in the various cases. 


1 Introduction 


A fundamental problem in computer security is the leakage of sensitive informa- 
tion due to correlation of secret values with observables—i.e., any information 
accessible to the attacker, such as, for instance, the system’s outputs or execu- 
tion time. The typical defense consists in reducing this correlation, which can 
be done in, essentially, two ways. The first, applicable when the correspondence 
secret-observable is deterministic, consists in coarsening the equivalence classes 
of secrets that give rise to the same observables. This can be achieved with 
post-processing, i.e., sequentially composing the original system with a program 
that removes information from observables. For example, a typical attack on
encrypted web traffic consists in the analysis of the packets’ length, and a typical
defense consists in padding extra bits so as to reduce the variety of lengths [28].

The second kind of defense, on which we focus in this work, consists in adding
controlled noise to the observables produced by the system. This can usually be
seen as a composition of different protocols via probabilistic choice.


Example 1 (Differential privacy). Consider a counting query f, namely a function
that, applied to a dataset x, returns the number of individuals in x that
satisfy a given property. A way to implement differential privacy [12] is to add
geometric noise to the result of f, so as to obtain a probability distribution P on
integers of the form $P(z) = c\,e^{-|z - f(x)|}$, where c is a normalization factor. The
resulting mechanism can be interpreted as a probabilistic choice on protocols of
the form f(x), f(x)+1, f(x)+2, ..., f(x)−1, f(x)−2, ..., where the probability
assigned to f(x)+n and to f(x)−n decreases exponentially with n.
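
As a small numerical illustration of Example 1, the sketch below builds such a noise distribution around a true count. It assumes a weight of exp(-|n|) for an offset n (unit decay rate) and truncates the support to a finite radius so that it can be enumerated; the function name and the truncation are assumptions made for this illustration only.

```python
import math

def geometric_noise_distribution(true_count, radius):
    """Return {z: P(z)} with P(z) proportional to exp(-|z - true_count|),
    truncated to |z - true_count| <= radius and renormalised."""
    weights = {true_count + n: math.exp(-abs(n)) for n in range(-radius, radius + 1)}
    total = sum(weights.values())
    return {z: w / total for z, w in weights.items()}

# Without truncation, the sum over all integers n of exp(-|n|) is (e + 1) / (e - 1),
# so the normalization factor is c = (e - 1) / (e + 1).
c = (math.e - 1) / (math.e + 1)

dist = geometric_noise_distribution(true_count=10, radius=50)
```

Each key of `dist` corresponds to one of the protocols f(x)+n, and its value is the probability with which the mechanism selects that protocol.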


Example 2 (Dining cryptographers). Consider two agents running the dining
cryptographers protocol [11], which consists in tossing a fair binary coin and
then declaring the exclusive or ⊕ of their secret value x and the result of the
coin. The protocol can be thought of as the fair probabilistic choice of two
protocols, one consisting simply of declaring x, and the other declaring x ⊕ 1.
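
The effect of the fair choice in Example 2 can be checked directly. The following sketch models a single agent's declaration (a simplification of the two-agent protocol) and computes the distribution of the declared bit for each secret; the function names and the `coin_bias` parameter are hypothetical.

```python
from fractions import Fraction

def declare_plain(x):
    return x        # first protocol: declare x

def declare_flipped(x):
    return x ^ 1    # second protocol: declare x XOR 1

def output_distribution(x, coin_bias=Fraction(1, 2)):
    """Distribution of the declared bit when the second protocol is chosen
    with probability coin_bias and the choice is not revealed."""
    dist = {0: Fraction(0), 1: Fraction(0)}
    dist[declare_plain(x)] += 1 - coin_bias
    dist[declare_flipped(x)] += coin_bias
    return dist
```

With a fair coin both secrets induce the same output distribution, so the declaration alone reveals nothing about x; with a biased coin the two distributions differ.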


Most of the work in the literature of quantitative information flow (QIF) con- 
siders passive attacks, in which the adversary only observes the system. Notable 
exceptions are the works [4,8,21], which consider attackers who interact with 
and influence the system, possibly in an adaptive way, with the purpose of max- 
imizing the leakage of information. 


Example 3 (CRIME attack). Compression Ratio Info-leak Made Easy (CRIME)
[25] is a security exploit against secret web cookies over connections using the
HTTPS and SPDY protocols and data compression. The idea is that the attacker
can inject some content a in the communication of the secret x from the target
site to the server. The server then compresses and encrypts the data, including
both a and x, and sends back the result. By observing the length of the result,
the attacker can then infer information about x. To mitigate the leakage, one
possible defense would consist in transmitting, along with x, also an encryption
method f selected randomly from a set F. Again, the resulting protocol can be
seen as a composition, using probabilistic choice, of the protocols in the set F.


In all examples above the main use of the probabilistic choice is to obfuscate
the relation between secrets and observables, thus reducing their correlation
and, hence, the information leakage. To achieve this goal, it is essential that the
attacker never comes to know the result of the choice. In the CRIME example,
however, if f and a are chosen independently, then (in general) it is still
better to choose f probabilistically, even if the adversary will come to know,
afterwards, the choice of f. In fact, this is true also for the attacker: his best
strategy (in general) is to choose a according to some probability distribution.
Indeed, suppose that F = {f1, f2} are the defender’s choices and A = {a1, a2}
are the attacker’s, and that f1(·, a1) leaks more than f1(·, a2), while f2(·, a1)
leaks less than f2(·, a2). This is a scenario like matching pennies in game
theory: if one player selects an action deterministically, the other player may
exploit this choice and get an advantage. For each player the optimal strategy
is to play probabilistically, using a distribution that maximizes his own gain for
all possible actions of the adversary. In zero-sum games, in which the gain of
one player coincides with the loss of the other, the optimal pair of distributions
always exists, and it is called a saddle point. It also coincides with the Nash
equilibrium, which is defined as the point in which neither of the two players gets
any advantage in changing unilaterally his strategy.

Motivated by these examples, this paper investigates the two kinds of choice, 
visible and hidden (to the attacker), in a game-theoretic setting. Looking at 
them as language operators, we study their algebraic properties, which will help 
reason about their behavior in games. We consider zero-sum games, in which the 
gain (for the attacker) is represented by the leakage. While for visible choice it 
is appropriate to use the “classic” game-theoretic framework, for hidden choice 
we need to adopt the more general framework of the information leakage games 
proposed in [4]. This happens because, in contrast with standard game theory, 
in games with hidden choice the utility of a mixed strategy is a convex func- 
tion of the distribution on the defender’s pure actions, rather than simply the 
expected value of their utilities. We will consider both simultaneous games—in 
which each player chooses independently—and sequential games—in which one 
player chooses his action first. We aim at comparing all these situations, and at 
identifying the precise advantage of the hidden choice over the visible one. 

To measure leakage we use the well-known information-theoretic model. A cen- 
tral notion in this model is that of entropy, but here we use its converse, vulnerabil- 
ity, which represents the magnitude of the threat. In order to derive results as gen- 
eral as possible, we adopt the very comprehensive notion of vulnerability as any 
convex and continuous function, as used in [5,8]. This notion has been shown [5] 
to subsume most information measures, including Bayes vulnerability (aka min- 
vulnerability, aka (the converse of) Bayes risk) [10,27], Shannon entropy [26], 
guessing entropy [22], and g-vulnerability [6]. 

The main contributions of this paper are: 


— We present a general framework for reasoning about information leakage in 
a game-theoretic setting, extending the notion of information leakage games 
proposed in [4] to both simultaneous and sequential games, with either hidden 
or visible choice. 

— We present a rigorous compositional way, using visible and hidden choice 
operators, for representing adversary and defender’s actions in information 
leakage games. In particular, we study the algebraic properties of visible and 
hidden choice on channels, and compare the two kinds of choice with respect 
to the capability of reducing leakage, in presence of an adaptive attacker. 

— We provide a taxonomy of the various scenarios (simultaneous and sequential) 
showing when randomization is necessary, for either attacker or defender, 
to achieve optimality. Although it is well-known in information flow that 
the defender’s best strategy is usually randomized, only recently it has been 
shown that when defender and adversary act simultaneously, the adversary’s
optimal strategy also requires randomization [4]. 

— We use our framework in a detailed case study of a password-checking proto- 
col. The naive program, which checks the password bit by bit and stops when 
it finds a mismatch, is clearly very insecure, because it reveals at each attempt 
the maximum correct prefix. On the other hand, if we continue checking until 
the end of the string (time padding), the program becomes very inefficient. 
We show that, by using probabilistic choice instead, we can obtain a good 
trade-off between security and efficiency. 


Plan of the Paper. The remainder of the paper is organized as follows. In Sect. 2
we review some basic notions of game theory and quantitative information flow.
In Sect. 3 we introduce our running example. In Sect. 4 we define the visible and
hidden choice operators and demonstrate their algebraic properties. In Sect. 5,
the core of the paper, we examine various scenarios for leakage games. In Sect. 6
we show an application of our framework to a password checker. In Sect. 7 we
discuss related work and, finally, in Sect. 8 we conclude.


2 Preliminaries 


In this section we review some basic notions from game theory and quantitative
information flow. We use the following notation: Given a set $\mathcal{I}$, we denote by
$\mathbb{D}\mathcal{I}$ the set of all probability distributions over $\mathcal{I}$. Given $\mu \in \mathbb{D}\mathcal{I}$, its support
$\mathrm{supp}(\mu) \stackrel{\mathrm{def}}{=} \{i \in \mathcal{I} : \mu(i) > 0\}$ is the set of its elements with positive probability.
We use $i \leftarrow \mu$ to indicate that a value $i \in \mathcal{I}$ is sampled from a distribution $\mu$ on $\mathcal{I}$.


2.1 Basic Concepts from Game Theory 


Two-Player Games. Two-player games are a model for reasoning about the
behavior of two players. In a game, each player has at his disposal a set of actions
that he can perform, and he obtains some gain or loss depending on the actions
chosen by both players. Gains and losses are defined using a real-valued payoff
function. Each player is assumed to be rational, i.e., his choice is driven by the
attempt to maximize his own expected payoff. We also assume that the set of
possible actions and the payoff functions of both players are common knowledge.

In this paper we only consider finite games, in which the sets of actions
available to the players are finite. Next we introduce an important distinction
between simultaneous and sequential games. In the following, we will call the two
players defender and attacker.


Simultaneous Games. In a simultaneous game, each player chooses his action
without knowing the action chosen by the other. The term “simultaneous” here
does not mean that the players’ actions are chosen at the same time, but only
that they are chosen independently. Formally, such a game is defined as a tuple¹
$(\mathcal{D}, \mathcal{A}, u_d, u_a)$, where $\mathcal{D}$ is a nonempty set of defender’s actions, $\mathcal{A}$ is a nonempty
set of attacker’s actions, $u_d : \mathcal{D} \times \mathcal{A} \to \mathbb{R}$ is the defender’s payoff function, and
$u_a : \mathcal{D} \times \mathcal{A} \to \mathbb{R}$ is the attacker’s payoff function.

Each player may choose an action deterministically or probabilistically. A
pure strategy of the defender (resp. attacker) is a deterministic choice of an
action, i.e., an element $d \in \mathcal{D}$ (resp. $a \in \mathcal{A}$). A pair $(d, a)$ is called a pure strategy
profile, and $u_d(d, a)$, $u_a(d, a)$ represent the defender’s and the attacker’s payoffs,
respectively. A mixed strategy of the defender (resp. attacker) is a probabilistic
choice of an action, defined as a probability distribution $\delta \in \mathbb{D}\mathcal{D}$ (resp.
$\alpha \in \mathbb{D}\mathcal{A}$). A pair $(\delta, \alpha)$ is called a mixed strategy profile. The defender’s and
the attacker’s expected payoff functions for mixed strategies are defined,
respectively, as:

$$U_d(\delta, \alpha) \stackrel{\mathrm{def}}{=} \mathbb{E}_{d \leftarrow \delta,\, a \leftarrow \alpha}\, u_d(d, a) = \sum_{d \in \mathcal{D},\, a \in \mathcal{A}} \delta(d)\, \alpha(a)\, u_d(d, a)$$

$$U_a(\delta, \alpha) \stackrel{\mathrm{def}}{=} \mathbb{E}_{d \leftarrow \delta,\, a \leftarrow \alpha}\, u_a(d, a) = \sum_{d \in \mathcal{D},\, a \in \mathcal{A}} \delta(d)\, \alpha(a)\, u_a(d, a)$$

A defender’s mixed strategy $\delta \in \mathbb{D}\mathcal{D}$ is a best response to an attacker’s mixed
strategy $\alpha \in \mathbb{D}\mathcal{A}$ if $U_d(\delta, \alpha) = \max_{\delta' \in \mathbb{D}\mathcal{D}} U_d(\delta', \alpha)$. Symmetrically, $\alpha \in \mathbb{D}\mathcal{A}$ is
a best response to $\delta \in \mathbb{D}\mathcal{D}$ if $U_a(\delta, \alpha) = \max_{\alpha' \in \mathbb{D}\mathcal{A}} U_a(\delta, \alpha')$. A mixed-strategy
Nash equilibrium is a profile $(\delta^*, \alpha^*)$ such that $\delta^*$ is the best response to $\alpha^*$
and vice versa. This means that in a Nash equilibrium, no unilateral deviation
by any single player provides better payoff to that player. If $\delta^*$ and $\alpha^*$ are
point distributions concentrated on some $d^* \in \mathcal{D}$ and $a^* \in \mathcal{A}$ respectively, then
$(\delta^*, \alpha^*)$ is a pure-strategy Nash equilibrium, and will be denoted by $(d^*, a^*)$.
While not all games have a pure strategy Nash equilibrium, every finite game
has a mixed strategy Nash equilibrium.
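
The definitions of expected payoff and best response can be made concrete with a short sketch. It assumes that actions are indexed by integers, payoffs are given as nested lists `u[d][a]`, and mixed strategies are lists of exact fractions; the function names are hypothetical. Since the expected payoff is linear in a player's own mixed strategy, its maximum over all mixed strategies is attained at a pure strategy, which is what the check below relies on.

```python
from fractions import Fraction

def expected_payoff(u, delta, alpha):
    """U(delta, alpha) = sum over d, a of delta(d) * alpha(a) * u(d, a)."""
    return sum(delta[d] * alpha[a] * u[d][a]
               for d in range(len(delta)) for a in range(len(alpha)))

def point(i, n):
    """Point distribution on action i among n actions."""
    return [Fraction(int(j == i)) for j in range(n)]

def is_nash(u_d, u_a, delta, alpha):
    """True iff delta and alpha are best responses to each other."""
    n_d, n_a = len(delta), len(alpha)
    best_d = max(expected_payoff(u_d, point(d, n_d), alpha) for d in range(n_d))
    best_a = max(expected_payoff(u_a, delta, point(a, n_a)) for a in range(n_a))
    return (expected_payoff(u_d, delta, alpha) == best_d
            and expected_payoff(u_a, delta, alpha) == best_a)

# Matching pennies as a zero-sum game: u_a = -u_d.
u_d = [[1, -1], [-1, 1]]
u_a = [[-1, 1], [1, -1]]
uniform = [Fraction(1, 2), Fraction(1, 2)]
```

For matching pennies the uniform profile is an equilibrium, whereas a deterministic defender can be exploited by the attacker.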


Sequential Games. In a sequential game players may take turns in choosing 
their actions. In this paper, we only consider the case in which each player moves 
only once, in such a way that one of the players (the leader) chooses his action 
first, and commits to it, before the other player (the follower) makes his choice. 
The follower may have total knowledge of the choice made by the leader, or 
only partial. We refer to the two scenarios by the terms perfect and imperfect 
information, respectively. 

We now give the precise definitions assuming that the leader is the defender. 
The case in which the leader is the attacker is similar. 

A defender-first sequential game with perfect information is a tuple
$(\mathcal{D}, \mathcal{D} \to \mathcal{A}, u_d, u_a)$ where $\mathcal{D}$, $\mathcal{A}$, $u_d$ and $u_a$ are defined as in simultaneous games.
Also the strategies of the defender (the leader) are defined as in simultaneous
games: an action $d \in \mathcal{D}$ for the pure case, and a distribution $\delta \in \mathbb{D}\mathcal{D}$ for
the mixed one. On the other hand, a pure strategy for the attacker is a function
$s_a : \mathcal{D} \to \mathcal{A}$, which represents the fact that his choice of an action $s_a(d)$ in $\mathcal{A}$
depends on the defender’s choice $d$. An attacker’s mixed strategy is a probability
distribution $\sigma_a \in \mathbb{D}(\mathcal{D} \to \mathcal{A})$ over his pure strategies.² The defender’s and the
attacker’s expected payoff functions for mixed strategies are defined,
respectively, as:

$$U_d(\delta, \sigma_a) \stackrel{\mathrm{def}}{=} \mathbb{E}_{d \leftarrow \delta,\, s_a \leftarrow \sigma_a}\, u_d(d, s_a(d)) = \sum_{d \in \mathcal{D},\, s_a : \mathcal{D} \to \mathcal{A}} \delta(d)\, \sigma_a(s_a)\, u_d(d, s_a(d))$$

$$U_a(\delta, \sigma_a) \stackrel{\mathrm{def}}{=} \mathbb{E}_{d \leftarrow \delta,\, s_a \leftarrow \sigma_a}\, u_a(d, s_a(d)) = \sum_{d \in \mathcal{D},\, s_a : \mathcal{D} \to \mathcal{A}} \delta(d)\, \sigma_a(s_a)\, u_a(d, s_a(d))$$

¹ Following the convention of security games, we set the first player to be the defender.

The case of imperfect information is typically formalized by assuming an
indistinguishability (equivalence) relation over the actions chosen by the leader,
representing a scenario in which the follower cannot distinguish between the
actions belonging to the same equivalence class. The pure strategies of the
follower, therefore, are functions from the set of the equivalence classes on the
actions of the leader to his own actions. Formally, a defender-first sequential
game with imperfect information is a tuple $(\mathcal{D}, \mathcal{K}_a \to \mathcal{A}, u_d, u_a)$ where $\mathcal{D}$, $\mathcal{A}$, $u_d$
and $u_a$ are defined as in simultaneous games, and $\mathcal{K}_a$ is a partition of $\mathcal{D}$. The
expected payoff functions are defined as before, except that now the argument
of $s_a$ is the equivalence class of $d$. Note that in the case in which all defender’s
actions are indistinguishable from each other in the eyes of the attacker (totally
imperfect information), we have $\mathcal{K}_a = \{\mathcal{D}\}$ and the expected payoff functions
coincide with those of the simultaneous games.
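
To see how the follower's strategy space depends on what he can observe, the sketch below enumerates his pure strategies as functions from information classes to actions. The representation (classes as frozensets, strategies as dictionaries) and the function name are assumptions made for illustration.

```python
from itertools import product

def follower_pure_strategies(classes, attacker_actions):
    """All functions from the given information classes to attacker actions."""
    return [dict(zip(classes, choice))
            for choice in product(attacker_actions, repeat=len(classes))]

defender_actions = ("d0", "d1", "d2")
attacker_actions = ("a0", "a1")

# Perfect information: every defender action is its own class.
perfect = [frozenset({d}) for d in defender_actions]
# Totally imperfect information: a single class containing all defender actions.
blind = [frozenset(defender_actions)]

perfect_strategies = follower_pure_strategies(perfect, attacker_actions)
blind_strategies = follower_pure_strategies(blind, attacker_actions)
```

With perfect information the attacker has one pure strategy per function from defender actions to attacker actions, while with a single class each pure strategy reduces to the choice of one action, as in a simultaneous game.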


Zero-sum Games and Minimax Theorem. A game $(\mathcal{D}, \mathcal{A}, u_d, u_a)$ is zero-sum
if for any $d \in \mathcal{D}$ and any $a \in \mathcal{A}$, the defender’s loss is equivalent to the
attacker’s gain, i.e., $u_d(d, a) = -u_a(d, a)$. For brevity, in zero-sum games we
denote by $u$ the attacker’s payoff function $u_a$, and by $U$ the attacker’s expected
payoff $U_a$.³ Consequently, the goal of the defender is to minimize $U$, and the
goal of the attacker is to maximize it.

In simultaneous zero-sum games the Nash equilibrium corresponds to the
solution of the minimax problem (or equivalently, the maximin problem),
namely, the strategy profile $(\delta^*, \alpha^*)$ such that $U(\delta^*, \alpha^*) = \min_{\delta} \max_{\alpha} U(\delta, \alpha)$.
The von Neumann’s minimax theorem, in fact, ensures that such a solution (which
always exists) is stable.


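Theorem 1 (von Neumann’s minimax theorem). Let $\mathcal{X} \subset \mathbb{R}^m$ and $\mathcal{Y} \subset \mathbb{R}^n$
be compact convex sets, and $U : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ be a continuous function such that
$U(x, y)$ is a convex function in $x \in \mathcal{X}$ and a concave function in $y \in \mathcal{Y}$. Then
$\min_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} U(x, y) = \max_{y \in \mathcal{Y}} \min_{x \in \mathcal{X}} U(x, y)$.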


² The definition of the mixed strategies as $\mathbb{D}(\mathcal{D} \to \mathcal{A})$ means that the attacker draws
a function $s_a : \mathcal{D} \to \mathcal{A}$ before he knows the choice of the defender. In contrast, the
so-called behavioral strategies are defined as functions $\mathcal{D} \to \mathbb{D}\mathcal{A}$, and formalize the
idea that the draw is made after the attacker knows such choice. In our setting, these
two definitions are equivalent, in the sense that they yield the same payoff.

³ Conventionally in game theory the payoff u is set to be that of the first player, but
we prefer to look at the payoff from the point of view of the attacker to be in line
with the definition of payoff as vulnerability.

A related property is that, under the conditions of Theorem 1,
there exists a saddle point $(x^*, y^*)$ s.t., for all $x \in \mathcal{X}$ and $y \in \mathcal{Y}$:
$U(x^*, y) \le U(x^*, y^*) \le U(x, y^*)$.

The solution of the minimax problem can be obtained by using convex
optimization techniques. In case $U(x, y)$ is affine in $x$ and in $y$, we can also use linear
optimization.

In case $\mathcal{D}$ and $\mathcal{A}$ contain two elements each, there is a closed form for the
solution. Let $\mathcal{D} = \{d_0, d_1\}$ and $\mathcal{A} = \{a_0, a_1\}$ respectively. Let $u_{ij}$ be the utility of
the defender on $d_i, a_j$. Then the Nash equilibrium $(\delta^*, \alpha^*)$ is given by:

$$\delta^*(d_0) = \frac{u_{11} - u_{10}}{u_{00} - u_{01} - u_{10} + u_{11}} \qquad \text{and} \qquad \alpha^*(a_0) = \frac{u_{11} - u_{01}}{u_{00} - u_{01} - u_{10} + u_{11}}$$

if these values are in [0,1]. Note that, since there are only two elements, the
strategy $\delta^*$ is completely specified by its value in $d_0$, and analogously for $\alpha^*$.
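
The closed form for two-by-two games is easy to evaluate mechanically. The sketch below is one possible implementation, assuming the utilities are given as a nested list `u[i][j]` of integers or fractions; the function name and the choice to return `None` when the formula does not apply (zero denominator or values outside [0,1]) are assumptions.

```python
from fractions import Fraction

def nash_2x2(u):
    """Return (delta*(d0), alpha*(a0)) from the closed form, or None if the
    denominator is zero or a value falls outside [0, 1]."""
    denom = u[0][0] - u[0][1] - u[1][0] + u[1][1]
    if denom == 0:
        return None
    delta0 = Fraction(u[1][1] - u[1][0], denom)
    alpha0 = Fraction(u[1][1] - u[0][1], denom)
    if 0 <= delta0 <= 1 and 0 <= alpha0 <= 1:
        return delta0, alpha0
    return None

matching_pennies = [[1, -1], [-1, 1]]
asymmetric = [[3, 0], [1, 2]]
```

For matching pennies both players mix uniformly. For the asymmetric example the result is $\delta^*(d_0) = 1/4$ and $\alpha^*(a_0) = 1/2$, and at these values each player is indifferent between his two actions (both yield 3/2), which is the defining property of the mixed equilibrium.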


2.2 Quantitative Information Flow 


Finally, we briefly review the standard framework of quantitative information 
flow, which is concerned with measuring the amount of information leakage in a 
(computational) system. 


Secrets and Vulnerability. A secret is some piece of sensitive information the 
defender wants to protect, such as a user’s password, social security number, or 
current location. The attacker usually only has some partial knowledge about 
the value of a secret, represented as a probability distribution on secrets called 
a prior. We denote by $\mathcal{X}$ the set of possible secrets, and we typically use $\pi$ to
denote a prior belonging to the set $\mathbb{D}\mathcal{X}$ of probability distributions over $\mathcal{X}$.

The vulnerability of a secret is a measure of the utility that it represents for 
the attacker. In this paper we consider a very general notion of vulnerability, 
following [5], and we define a vulnerability $\mathbb{V}$ to be any continuous and convex
function of type $\mathbb{D}\mathcal{X} \to \mathbb{R}$. It has been shown in [5] that these functions coincide
with the set of g-vulnerabilities, and are, in a precise sense, the most general
information measures w.r.t. a set of basic axioms.⁴


Channels, Posterior Vulnerability, and Leakage. Computational systems can be
modeled as information theoretic channels. A channel $C : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ is a
function in which $\mathcal{X}$ is a set of input values, $\mathcal{Y}$ is a set of output values, and
$C(x, y)$ represents the conditional probability of the channel producing output
$y \in \mathcal{Y}$ when input $x \in \mathcal{X}$ is provided. Every channel $C$ satisfies $0 \le C(x, y) \le 1$
for all $x \in \mathcal{X}$ and $y \in \mathcal{Y}$, and $\sum_{y \in \mathcal{Y}} C(x, y) = 1$ for all $x \in \mathcal{X}$.

A distribution $\pi \in \mathbb{D}\mathcal{X}$ and a channel $C$ with inputs $\mathcal{X}$ and outputs $\mathcal{Y}$ induce a
joint distribution $p(x, y) = \pi(x) C(x, y)$ on $\mathcal{X} \times \mathcal{Y}$, producing joint random
variables $X, Y$ with marginal probabilities $p(x) = \sum_y p(x, y)$ and $p(y) = \sum_x p(x, y)$,
and conditional probabilities $p(x \mid y) = p(x, y)/p(y)$ if $p(y) \neq 0$. For a given $y$ (s.t.
$p(y) \neq 0$), the conditional probabilities $p(x \mid y)$ for each $x \in \mathcal{X}$ form the posterior
distribution $p_{X \mid y}$.

4 More precisely, if posterior vulnerability is defined as the expectation of the
vulnerability of posterior distributions, the measure respects the data-processing inequality
and always yields non-negative leakage iff vulnerability is convex.

A channel $C$ in which $\mathcal{X}$ is a set of secret values and $\mathcal{Y}$ is a set of observable
values produced by a system can be used to model computations on secrets.
Assuming the attacker has prior knowledge $\pi$ about the secret value, knows
how a channel $C$ works, and can observe the channel’s outputs, the effect of the
channel is to update the attacker’s knowledge from $\pi$ to a collection of posteriors
$p_{X \mid y}$, each occurring with probability $p(y)$.
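
The update from a prior to a collection of posteriors can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `posteriors`, the dictionary encoding of priors and channels, and the concrete numbers are all hypothetical choices, not part of the paper.

```python
from fractions import Fraction


def posteriors(prior, channel):
    """Return {y: (p(y), posterior over x)} for every y with p(y) != 0.

    prior:   dict x -> pi(x)
    channel: dict x -> dict y -> C(x, y)
    """
    outputs = sorted({y for row in channel.values() for y in row})
    result = {}
    for y in outputs:
        # Joint probabilities p(x, y) = pi(x) * C(x, y) for this output.
        joint = {x: prior[x] * channel[x].get(y, 0) for x in prior}
        p_y = sum(joint.values())
        if p_y != 0:
            result[y] = (p_y, {x: joint[x] / p_y for x in prior})
    return result


# Hypothetical prior and channel, for illustration only.
half, third = Fraction(1, 2), Fraction(1, 3)
prior = {0: half, 1: half}
channel = {0: {0: third, 1: 2 * third}, 1: {0: 2 * third, 1: third}}
post = posteriors(prior, channel)
```

Each entry of `post` pairs the probability $p(y)$ of an observation with the posterior distribution the attacker holds after making it.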

Given a vulnerability $\mathbb{V}$, a prior $\pi$, and a channel $C$, the posterior vulnerability
$\mathbb{V}[\pi, C]$ is the vulnerability of the secret after the attacker has observed the
output of the channel $C$. Formally: $\mathbb{V}[\pi, C] \stackrel{\text{def}}{=} \sum_{y \in \mathcal{Y}} p(y)\, \mathbb{V}[p_{X \mid y}]$.

It is known from the literature [5] that the posterior vulnerability is a convex
function of $\pi$. Namely, for any channel $C$, any family of distributions $\{\pi_i\}$, and
any set of convex coefficients $\{c_i\}$, we have: $\mathbb{V}[\sum_i c_i \pi_i, C] \le \sum_i c_i\, \mathbb{V}[\pi_i, C]$.

The (information) leakage of channel $C$ under prior $\pi$ is a comparison
between the vulnerability of the secret before the system was run (called prior
vulnerability) and the posterior vulnerability of the secret. Leakage reflects by
how much the observation of the system’s outputs increases the attacker’s
information about the secret. It can be defined either additively ($\mathbb{V}[\pi, C] - \mathbb{V}[\pi]$),
or multiplicatively ($\mathbb{V}[\pi, C] / \mathbb{V}[\pi]$).
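
The definitions of posterior vulnerability and leakage can be illustrated with a small Python sketch. It takes the vulnerability as a function argument; the choice of `bayes` (the probability of the likeliest secret) as the vulnerability, as well as the prior and the channel, are assumptions made only for this illustration.

```python
from fractions import Fraction


def posterior_vulnerability(vuln, prior, channel):
    """V[pi, C] = sum_y p(y) * V[p_{X|y}], skipping outputs with p(y) = 0.

    vuln:    function mapping a distribution (list of probabilities) to a number
    prior:   list of probabilities pi(x)
    channel: list of rows, channel[x][y] = C(x, y)
    """
    total = 0
    for y in range(len(channel[0])):
        joint = [prior[x] * channel[x][y] for x in range(len(prior))]
        p_y = sum(joint)
        if p_y != 0:
            total += p_y * vuln([j / p_y for j in joint])
    return total


def bayes(dist):
    # Illustrative vulnerability: probability of the likeliest secret.
    return max(dist)


# Hypothetical prior and channel, for illustration only.
prior = [Fraction(1, 2), Fraction(1, 2)]
channel = [[Fraction(1, 3), Fraction(2, 3)],
           [Fraction(2, 3), Fraction(1, 3)]]

prior_v = bayes(prior)
post_v = posterior_vulnerability(bayes, prior, channel)
additive_leakage = post_v - prior_v
multiplicative_leakage = post_v / prior_v
```

Because the prior vulnerability does not depend on the channel, comparing channels by either form of leakage under a fixed prior amounts to comparing their posterior vulnerabilities.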


3 An Illustrative Example 


We introduce an example which will serve as a running example throughout the
paper. Although admittedly contrived, this example is simple and yet produces
different leakage measures for all different combinations of visible/invisible
choice and simultaneous/sequential games, thus providing a way to
compare all different scenarios we are interested in.

Consider that a binary secret must be processed 
by a program. As usual, a defender wants to pro- 
tect the secret value, whereas an attacker wants to 
infer it by observing the system’s output. Assume 
the defender can choose which among two alterna- 
tive versions of the program to run. Both programs 
take the secret value x as high input, and a binary 
low input a whose value is chosen by the attacker. 
They both return the output in a low variable $y$.⁵
Program 0 returns the binary product of $x$ and $a$,
whereas Program 1 flips a coin with bias $a/3$ (i.e., a coin which returns heads
with probability $a/3$) and returns $x$ if the result is heads, and the complement $\bar{x}$
of $x$ otherwise. The two programs are represented in Fig. 1.


Program 0

High Input: $x \in \{0,1\}$
Low Input: $a \in \{0,1\}$
Output: $y \in \{0,1\}$

$y = x \cdot a$
return $y$

Program 1

High Input: $x \in \{0,1\}$
Low Input: $a \in \{0,1\}$
Output: $y \in \{0,1\}$

$c \leftarrow$ flip coin with bias $a/3$
if $c$ = heads {$y = x$}
else {$y = \bar{x}$}
return $y$


Fig. 1. Running example.


5 We adopt the usual convention in QIF of referring to secret variables, inputs and
outputs in programs as high, and to their observable counterparts as low.

The combined choices of the defender and of the attacker determine how
the system behaves. Let $\mathcal{D} = \{0,1\}$ represent the set of the defender’s choices
(i.e., the index of the program to use), and $\mathcal{A} = \{0,1\}$ represent the set of
the attacker’s choices (i.e., the value of the low input $a$). We shall refer to the
elements of $\mathcal{D}$ and $\mathcal{A}$ as actions. For each possible combination of actions $d \in \mathcal{D}$
and $a \in \mathcal{A}$, we can construct a channel $C_{da}$ modeling how the resulting system
behaves. Each channel $C_{da}$ is a function of type $\mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, where $\mathcal{X} = \{0,1\}$
is the set of possible high input values for the system, and $\mathcal{Y} = \{0,1\}$ is the set
of possible output values from the system. Intuitively, each channel provides the
probability that the system (which was fixed by the defender) produces output
$y \in \mathcal{Y}$ given that the high input is $x \in \mathcal{X}$ (and that the low input was fixed by
the attacker). The four possible channels are depicted as matrices below.


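\[
\begin{array}{c|cc}
C_{00} & y=0 & y=1 \\ \hline
x=0 & 1 & 0 \\
x=1 & 1 & 0
\end{array}
\qquad
\begin{array}{c|cc}
C_{01} & y=0 & y=1 \\ \hline
x=0 & 1 & 0 \\
x=1 & 0 & 1
\end{array}
\qquad
\begin{array}{c|cc}
C_{10} & y=0 & y=1 \\ \hline
x=0 & 0 & 1 \\
x=1 & 1 & 0
\end{array}
\qquad
\begin{array}{c|cc}
C_{11} & y=0 & y=1 \\ \hline
x=0 & 1/3 & 2/3 \\
x=1 & 2/3 & 1/3
\end{array}
\]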


Note that channel $C_{00}$ does not leak any information about the input $x$
(i.e., it is non-interferent), whereas channels $C_{01}$ and $C_{10}$ completely reveal $x$.
Channel $C_{11}$ is an intermediate case: it leaks some information about $x$, but
not all.

We want to investigate how the defender’s and the attacker’s choices influence 
the leakage of the system. For that we can just consider the (simpler) notion of 
posterior vulnerability, since in order to make the comparison fair we need to 
assume that the prior is always the same in the various scenarios, and this 
implies that the leakage is in a one-to-one correspondence with the posterior 
vulnerability (this happens for both additive and multiplicative leakage). 

For this example, assume we are interested in Bayes vulnerability [10,27], defined
as $\mathbb{V}(\pi) = \max_x \pi(x)$ for every $\pi \in \mathbb{D}\mathcal{X}$. Assume for simplicity that the prior is the
uniform prior $\pi_u$. In this case we know from [9] that the posterior Bayes vulnerability of
a channel is the sum of the greatest elements of each column, divided by the total
number of inputs. Table 1 provides the Bayes vulnerability
$\mathbb{V}_{da} \stackrel{\text{def}}{=} \mathbb{V}[\pi_u, C_{da}]$ of each channel considered above.


Table 1. Vulnerability of each channel $C_{da}$ in the running example.

  V     | a = 0 | a = 1
  d = 0 | 1/2   | 1
  d = 1 | 1     | 2/3
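
The entries of Table 1 can be recomputed from the program descriptions. The sketch below is an illustration only: the helper names `channel` and `bayes_uniform` and the list-of-rows encoding are our assumptions, while the column-maximum rule is the one recalled above for the uniform prior.

```python
from fractions import Fraction


def channel(d, a):
    """Channel C_da of the running example as rows [C(x, 0), C(x, 1)]."""
    rows = []
    for x in (0, 1):
        if d == 0:
            # Program 0 outputs y = x * a with certainty.
            p_y1 = Fraction(x * a)
        else:
            # Program 1 outputs x with probability a/3, else the complement.
            heads = Fraction(a, 3)
            p_y1 = heads if x == 1 else 1 - heads
        rows.append([1 - p_y1, p_y1])
    return rows


def bayes_uniform(c):
    """Posterior Bayes vulnerability under the uniform prior:
    sum of the column maxima divided by the number of inputs."""
    columns = zip(*c)
    return sum(max(col) for col in columns) / len(c)


table1 = {(d, a): bayes_uniform(channel(d, a)) for d in (0, 1) for a in (0, 1)}
```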

Naturally, the attacker aims at maximizing the vulnerability of the system, 
while the defender tries to minimize it. The resulting vulnerability will depend 
on various factors, in particular on whether the two players make their choice 
simultaneously (i.e. without knowing the choice of the opponent) or sequentially. 
Clearly, if the choice of a player who moves first is known by an opponent who
moves second, the opponent will be at an advantage. In the above example, for
instance, if the defender knows the choice $a$ of the attacker, the most convenient
choice for him is to set $d = a$, and the vulnerability will be at most 2/3. Vice
versa, if the attacker knows the choice $d$ of the defender, the most convenient
choice for him is to set $a \neq d$. The vulnerability in this case will be 1.

Things become more complicated when players make choices simultaneously. 
None of the pure choices of d and a are the best for the corresponding player, 
because the vulnerability of the system depends also on the (unknown) choice 
of the other player. Yet there is a strategy leading to the best possible situation 
for both players (the Nash equilibrium), but it is mixed (i.e., probabilistic), in 
that the players randomize their choices according to some precise distribution. 

Another factor that affects vulnerability is whether or not the defender’s 
choice is known to the attacker at the moment in which he observes the output 
of the channel. Obviously, this corresponds to whether or not the attacker knows 
what channel he is observing. Both cases are plausible: naturally the defender 
has all the interest in keeping his choice (and, hence, the channel used) secret, 
since then the attack will be less effective (i.e., leakage will be smaller). On the 
other hand, the attacker may be able to identify the channel used anyway, for 
instance because the two programs have different running times. We will call 
these two cases hidden and visible choice, respectively. 

It is possible to model players’ strategies, as well as hidden and visible choices, 
as operations on channels. This means that we can look at the whole system as 
if it were a single channel, which will turn out to be useful for some proofs of our 
technical results. The next section is dedicated to the definition of these operators.
We will calculate the exact values for our example in Sect. 5. 


4 Visible and Hidden Choice Operators on Channels 


In this section we define matrices and some basic operations on them. Since 
channels are a particular kind of matrix, we use these matrix operations to 
define the operations of visible and hidden choice among channels, and to prove 
important properties of these channel operations. 


4.1 Matrices, and Their Basic Operators 


Given two sets $\mathcal{X}$ and $\mathcal{Y}$, a matrix is a total function of type $\mathcal{X} \times \mathcal{Y} \to \mathbb{R}$.
Two matrices $M_1 : \mathcal{X}_1 \times \mathcal{Y}_1 \to \mathbb{R}$ and $M_2 : \mathcal{X}_2 \times \mathcal{Y}_2 \to \mathbb{R}$ are said to be
compatible if $\mathcal{X}_1 = \mathcal{X}_2$. If it is also the case that $\mathcal{Y}_1 = \mathcal{Y}_2$, we say that the
matrices have the same type. The scalar multiplication $r \cdot M$ between a scalar $r$
and a matrix $M$ is defined as usual, and so is the summation
$(\sum_{i \in \mathcal{I}} M_i)(x, y) = M_{i_1}(x, y) + \ldots + M_{i_n}(x, y)$ of a family $\{M_i\}_{i \in \mathcal{I}}$ of matrices all of a same type.

Given a family $\{M_i\}_{i \in \mathcal{I}}$ of compatible matrices s.t. each $M_i$ has type $\mathcal{X} \times \mathcal{Y}_i \to \mathbb{R}$,
their concatenation $\Diamond_{i \in \mathcal{I}}$ is the matrix having all columns of every matrix in
the family, in such a way that every column is tagged with the matrix it came
from. Formally, $(\Diamond_{i \in \mathcal{I}} M_i)(x, (y, j)) = M_j(x, y)$, if $y \in \mathcal{Y}_j$, and the resulting
matrix has type $\mathcal{X} \times (\bigsqcup_{i \in \mathcal{I}} \mathcal{Y}_i) \to \mathbb{R}$.⁶ When the family $\{M_i\}$ has only two
elements we may use the binary version $\diamond$ of the concatenation operator. The
following depicts the concatenation of two matrices $M_1$ and $M_2$ in tabular form.


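\[
\begin{array}{c|cc}
M_1 & y_1 & y_2 \\ \hline
x_1 & 1 & 2 \\
x_2 & 3 & 4
\end{array}
\;\diamond\;
\begin{array}{c|ccc}
M_2 & y_1 & y_2 & y_3 \\ \hline
x_1 & 5 & 6 & 7 \\
x_2 & 8 & 9 & 10
\end{array}
\;=\;
\begin{array}{c|ccccc}
M_1 \diamond M_2 & (y_1,1) & (y_2,1) & (y_1,2) & (y_2,2) & (y_3,2) \\ \hline
x_1 & 1 & 2 & 5 & 6 & 7 \\
x_2 & 3 & 4 & 8 & 9 & 10
\end{array}
\]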
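
The tagging of columns in a concatenation can be made concrete with a short Python sketch. The dictionary encoding of matrices (keys `(x, y)`) and the function name `concatenate` are assumptions made for this illustration; the numbers are those of the tabular example.

```python
def concatenate(family):
    """Concatenate compatible matrices.

    family: dict j -> matrix, where each matrix is a dict (x, y) -> value
    and all matrices share the same set of row labels x.
    Returns a dict (x, (y, j)) -> value, i.e. every column is tagged
    with the index j of the matrix it came from.
    """
    result = {}
    for j, matrix in family.items():
        for (x, y), value in matrix.items():
            result[(x, (y, j))] = value
    return result


m1 = {("x1", "y1"): 1, ("x1", "y2"): 2,
      ("x2", "y1"): 3, ("x2", "y2"): 4}
m2 = {("x1", "y1"): 5, ("x1", "y2"): 6, ("x1", "y3"): 7,
      ("x2", "y1"): 8, ("x2", "y2"): 9, ("x2", "y3"): 10}
m = concatenate({1: m1, 2: m2})
```

Tagging keeps the columns `y1` of the two matrices apart, which is why the result has five columns rather than three.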


4.2 Channels, and Their Hidden and Visible Choice Operators 


A channel is a stochastic matrix, i.e., all elements are non-negative, and all rows 
sum up to 1. Here we will define two operators specific for channels. In the 
following, for any real value $0 \le p \le 1$, we denote by $\bar{p}$ the value $1 - p$.


Hidden Choice. The first operator models a hidden probabilistic choice among
channels. Consider a family $\{C_i\}_{i \in \mathcal{I}}$ of channels of a same type. Let $\mu \in \mathbb{D}\mathcal{I}$ be
a probability distribution on the elements of the index set $\mathcal{I}$. Consider that an input
$x$ is fed to one of the channels in $\{C_i\}_{i \in \mathcal{I}}$, where the channel is randomly picked
according to $\mu$. More precisely, an index $i \in \mathcal{I}$ is sampled with probability $\mu(i)$,
then the input $x$ is fed to channel $C_i$, and the output $y$ produced by the channel
is then made visible, but not the index $i$ of the channel that was used. Note that
we consider hidden choice only among channels of a same type: if the sets of
outputs were not identical, the produced output might implicitly reveal which
channel was used.

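Formally, given a family $\{C_i\}_{i \in \mathcal{I}}$ of channels s.t. each $C_i$ has same type
$\mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, the hidden choice operator $\bigoplus_{i \leftarrow \mu}$ is defined as
$\bigoplus_{i \leftarrow \mu} C_i = \sum_{i \in \mathcal{I}} \mu(i)\, C_i$.

Proposition 2. Given a family $\{C_i\}_{i \in \mathcal{I}}$ of channels of type $\mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, and a
distribution $\mu$ on $\mathcal{I}$, the hidden choice $\bigoplus_{i \leftarrow \mu} C_i$ is a channel of type $\mathcal{X} \times \mathcal{Y} \to \mathbb{R}$.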


In the particular case in which the family $\{C_i\}$ has only two elements $C_{i_1}$ and
$C_{i_2}$, the distribution $\mu$ on indexes is completely determined by a real value $0 \le p \le 1$
s.t. $\mu(i_1) = p$ and $\mu(i_2) = \bar{p}$. In this case we may use the binary version ${}_p{\oplus}$ of
the hidden choice operator: $C_{i_1} \;{}_p{\oplus}\; C_{i_2} = p\, C_{i_1} + \bar{p}\, C_{i_2}$. The example below depicts
the hidden choice between channels $C_1$ and $C_2$, with probability $p = 1/3$.


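\[
\begin{array}{c|cc}
C_1 & y_1 & y_2 \\ \hline
x_1 & 1/2 & 1/2 \\
x_2 & 1/3 & 2/3
\end{array}
\;\;{}_{1/3}{\oplus}\;\;
\begin{array}{c|cc}
C_2 & y_1 & y_2 \\ \hline
x_1 & 1/3 & 2/3 \\
x_2 & 1/2 & 1/2
\end{array}
\;=\;
\begin{array}{c|cc}
C_1 \;{}_{1/3}{\oplus}\; C_2 & y_1 & y_2 \\ \hline
x_1 & 7/18 & 11/18 \\
x_2 & 4/9 & 5/9
\end{array}
\]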
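
As an illustrative sketch, hidden choice is simply an entrywise convex combination, which the following Python code computes exactly for the example above. The function name `hidden_choice` and the list-of-rows encoding are assumptions of this sketch.

```python
from fractions import Fraction as F


def hidden_choice(mu, channels):
    """Hidden choice: the convex combination sum_i mu[i] * channels[i].

    mu:       dict i -> probability
    channels: dict i -> matrix (list of rows), all of the same type
    """
    first = next(iter(channels.values()))
    n_rows, n_cols = len(first), len(first[0])
    return [[sum(mu[i] * channels[i][x][y] for i in mu)
             for y in range(n_cols)]
            for x in range(n_rows)]


c1 = [[F(1, 2), F(1, 2)], [F(1, 3), F(2, 3)]]
c2 = [[F(1, 3), F(2, 3)], [F(1, 2), F(1, 2)]]
mixed = hidden_choice({1: F(1, 3), 2: F(2, 3)}, {1: c1, 2: c2})
```

Every row of `mixed` still sums to 1, in line with Proposition 2.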


Visible Choice. The second operator models a visible probabilistic choice
among channels. Consider a family $\{C_i\}_{i \in \mathcal{I}}$ of compatible channels. Let $\mu \in \mathbb{D}\mathcal{I}$
be a probability distribution on the elements of the index set $\mathcal{I}$. Consider that an
input $x$ is fed to one of the channels in $\{C_i\}_{i \in \mathcal{I}}$, where the channel is randomly
picked according to $\mu$. More precisely, an index $i \in \mathcal{I}$ is sampled with probability
$\mu(i)$, then the input $x$ is fed to channel $C_i$, and the output $y$ produced by the
channel is then made visible, along with the index $i$ of the channel that was used.
Note that visible choice makes sense only between compatible channels, but it
is not required that the output set of each channel be the same.


6 $\bigsqcup_{i \in \mathcal{I}} \mathcal{Y}_i = \mathcal{Y}_{i_1} \sqcup \mathcal{Y}_{i_2} \sqcup \ldots \sqcup \mathcal{Y}_{i_n}$ denotes the disjoint union $\{(y, i) \mid y \in \mathcal{Y}_i, i \in \mathcal{I}\}$ of
the sets $\mathcal{Y}_{i_1}, \mathcal{Y}_{i_2}, \ldots, \mathcal{Y}_{i_n}$.

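Formally, given a family $\{C_i\}_{i \in \mathcal{I}}$ of compatible channels s.t. each $C_i$ has type
$\mathcal{X} \times \mathcal{Y}_i \to \mathbb{R}$, and a distribution $\mu$ on $\mathcal{I}$, the visible choice operator
$\dot{\bigsqcup}_{i \leftarrow \mu}$ is defined as

\[
\dot{\bigsqcup}_{i \leftarrow \mu} C_i = \Diamond_{i \in \mathcal{I}}\; \mu(i)\, C_i .
\]


Proposition 3. Given a family $\{C_i\}_{i \in \mathcal{I}}$ of compatible channels s.t. each $C_i$ has
type $\mathcal{X} \times \mathcal{Y}_i \to \mathbb{R}$, and a distribution $\mu$ on $\mathcal{I}$, the result of the visible choice
$\dot{\bigsqcup}_{i \leftarrow \mu} C_i$ is a channel of type $\mathcal{X} \times (\bigsqcup_{i \in \mathcal{I}} \mathcal{Y}_i) \to \mathbb{R}$.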


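In the particular case the family $\{C_i\}$ has only two elements $C_{i_1}$ and $C_{i_2}$,
the distribution $\mu$ on indexes is completely determined by a real value $0 \le p \le 1$
s.t. $\mu(i_1) = p$ and $\mu(i_2) = \bar{p}$. In this case we may use the binary version ${}_p{\dot\sqcup}$ of
the visible choice operator: $C_{i_1} \;{}_p{\dot\sqcup}\; C_{i_2} = p\, C_{i_1} \diamond \bar{p}\, C_{i_2}$. The following depicts the
visible choice between channels $C_1$ and $C_3$, with probability $p = 1/3$.

\[
\begin{array}{c|cc}
C_1 & y_1 & y_2 \\ \hline
x_1 & 1/2 & 1/2 \\
x_2 & 1/3 & 2/3
\end{array}
\;\;{}_{1/3}{\dot\sqcup}\;\;
\begin{array}{c|cc}
C_3 & y_1 & y_3 \\ \hline
x_1 & 1/3 & 2/3 \\
x_2 & 1/2 & 1/2
\end{array}
\;=\;
\begin{array}{c|cccc}
C_1 \;{}_{1/3}{\dot\sqcup}\; C_3 & (y_1,1) & (y_2,1) & (y_1,3) & (y_3,3) \\ \hline
x_1 & 1/6 & 1/6 & 2/9 & 4/9 \\
x_2 & 1/9 & 2/9 & 1/3 & 1/3
\end{array}
\]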
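
For comparison with hidden choice, the following Python sketch computes a visible choice by scaling each channel and concatenating the tagged columns. The function name `visible_choice` and the `(labels, matrix)` encoding are illustrative assumptions; the data are those of the example above.

```python
from fractions import Fraction as F


def visible_choice(mu, channels):
    """Visible choice: concatenation of the scaled channels mu[i] * channels[i].

    mu:       dict i -> probability
    channels: dict i -> (output labels, matrix as list of rows)
    Returns (tagged output labels, matrix); each column is tagged (y, i).
    """
    labels = [(y, i) for i in mu for y in channels[i][0]]
    n_rows = len(next(iter(channels.values()))[1])
    matrix = [[mu[i] * value for i in mu for value in channels[i][1][x]]
              for x in range(n_rows)]
    return labels, matrix


c1 = (["y1", "y2"], [[F(1, 2), F(1, 2)], [F(1, 3), F(2, 3)]])
c3 = (["y1", "y3"], [[F(1, 3), F(2, 3)], [F(1, 2), F(1, 2)]])
labels, matrix = visible_choice({1: F(1, 3), 3: F(2, 3)}, {1: c1, 3: c3})
```

Unlike hidden choice, the two channels need not share an output set, and the result has one column per tagged output.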


4.3 Properties of Hidden and Visible Choice Operators 


We now prove algebraic properties of channel operators. These properties will be 
useful when we model a (more complex) protocol as the composition of smaller 
channels via hidden or visible choice. 

Whereas the properties of hidden choice hold generally with equality, those 
of visible choice are subtler. For instance, visible choice is not idempotent, since 
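in general $C \;{}_p{\dot\sqcup}\; C \neq C$. (In fact if $C$ has type $\mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, $C \;{}_p{\dot\sqcup}\; C$ has type
$\mathcal{X} \times (\mathcal{Y} \sqcup \mathcal{Y}) \to \mathbb{R}$.) However, idempotency and other properties involving visible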
choice hold if we replace the notion of equality with the more relaxed notion of 
“equivalence” between channels. Intuitively, two channels are equivalent if they 
have the same input space and yield the same value of vulnerability for every 
prior and every vulnerability function. 


Definition 4 (Equivalence of channels). Two compatible channels $C_1$ and
$C_2$ with domain $\mathcal{X}$ are equivalent, denoted by $C_1 \approx C_2$, if for every prior $\pi \in \mathbb{D}\mathcal{X}$
and every posterior vulnerability $\mathbb{V}$ we have $\mathbb{V}[\pi, C_1] = \mathbb{V}[\pi, C_2]$.


Two equivalent channels are indistinguishable from the point of view of infor- 
mation leakage, and in most cases we can just identify them. Indeed, nowadays 
there is a tendency to use abstract channels [5,23], which capture exactly the 
important behavior with respect to any form of leakage. In this paper, however, 
we cannot use abstract channels because the hidden choice operator needs a 
concrete representation in order to be defined unambiguously.

The first properties we prove regard idempotency of operators, which can be
used to simplify the representation of some protocols.


Proposition 5 (Idempotency). Given a family $\{C_i\}_{i \in \mathcal{I}}$ of channels s.t. $C_i = C$
for all $i \in \mathcal{I}$, and a distribution $\mu$ on $\mathcal{I}$, then: (a) $\bigoplus_{i \leftarrow \mu} C_i = C$; and (b)
$\dot{\bigsqcup}_{i \leftarrow \mu} C_i \approx C$.


The following properties regard the reorganization of operators, and they 
will be essential in some technical results in which we invert the order in which 
hidden and visible choice are applied in a protocol. 


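Proposition 6 (“Reorganization of operators”). Given a family $\{C_{ij}\}_{i \in \mathcal{I}, j \in \mathcal{J}}$
of channels indexed by sets $\mathcal{I}$ and $\mathcal{J}$, a distribution $\mu$ on $\mathcal{I}$, and a distribution $\eta$
on $\mathcal{J}$:


(a) $\bigoplus_{i \leftarrow \mu} \bigoplus_{j \leftarrow \eta} C_{ij} = \bigoplus_{\substack{i \leftarrow \mu \\ j \leftarrow \eta}} C_{ij}$, if all $C_{ij}$’s have the same type;

(b) $\dot{\bigsqcup}_{i \leftarrow \mu} \dot{\bigsqcup}_{j \leftarrow \eta} C_{ij} \approx \dot{\bigsqcup}_{\substack{i \leftarrow \mu \\ j \leftarrow \eta}} C_{ij}$, if all $C_{ij}$’s are compatible; and

(c) $\bigoplus_{i \leftarrow \mu} \dot{\bigsqcup}_{j \leftarrow \eta} C_{ij} \approx \dot{\bigsqcup}_{j \leftarrow \eta} \bigoplus_{i \leftarrow \mu} C_{ij}$, if, for each $i$, all $C_{ij}$’s have same type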


4.4 Properties of Vulnerability w.r.t. Channel Operators 


We now derive some relevant properties of vulnerability w.r.t. our channel opera- 
tors, which will be later used to obtain the Nash equilibria in information leakage 
games with different choice operations. 

The first result states that posterior vulnerability is convex w.r.t. hidden
choice (this result was already presented in [4]), and linear w.r.t. visible choice.


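Theorem 7. Let $\{C_i\}_{i \in \mathcal{I}}$ be a family of channels, and $\mu$ be a distribution on $\mathcal{I}$.
Then, for every distribution $\pi$ on $\mathcal{X}$, and every vulnerability $\mathbb{V}$:


(a) posterior vulnerability is convex w.r.t. hidden choice:
$\mathbb{V}[\pi, \bigoplus_{i \leftarrow \mu} C_i] \le \sum_{i \in \mathcal{I}} \mu(i)\, \mathbb{V}[\pi, C_i]$ if all $C_i$’s have the same type.

(b) posterior vulnerability is linear w.r.t. visible choice:
$\mathbb{V}[\pi, \dot{\bigsqcup}_{i \leftarrow \mu} C_i] = \sum_{i \in \mathcal{I}} \mu(i)\, \mathbb{V}[\pi, C_i]$ if all $C_i$’s are compatible.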
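
The two parts of Theorem 7 can be checked numerically on one instance. The sketch below is only an illustration, not a proof: it assumes Bayes vulnerability with the uniform prior, uses the channels $C_{01}$ and $C_{11}$ of the running example of Sect. 3, and picks $p = 1/4$ arbitrarily; the helper names are hypothetical.

```python
from fractions import Fraction as F


def bayes_uniform(c):
    # Posterior Bayes vulnerability under the uniform prior:
    # sum of the column maxima divided by the number of inputs.
    return sum(max(col) for col in zip(*c)) / len(c)


def hidden(p, c1, c2):
    # Binary hidden choice: entrywise p*c1 + (1-p)*c2.
    return [[p * v1 + (1 - p) * v2 for v1, v2 in zip(r1, r2)]
            for r1, r2 in zip(c1, c2)]


def visible(p, c1, c2):
    # Binary visible choice: columns of p*c1 followed by columns of (1-p)*c2.
    return [[p * v for v in r1] + [(1 - p) * v for v in r2]
            for r1, r2 in zip(c1, c2)]


c01 = [[F(1), F(0)], [F(0), F(1)]]
c11 = [[F(1, 3), F(2, 3)], [F(2, 3), F(1, 3)]]
p = F(1, 4)

average = p * bayes_uniform(c01) + (1 - p) * bayes_uniform(c11)
v_hidden = bayes_uniform(hidden(p, c01, c11))
v_visible = bayes_uniform(visible(p, c01, c11))
```

In this instance the hidden choice yields a strictly smaller vulnerability than the average of the components, while the visible choice matches the average exactly.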


The next result is concerned with posterior vulnerability under the compo- 
sition of channels using both operators. 


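Corollary 8. Let $\{C_{ij}\}_{i \in \mathcal{I}, j \in \mathcal{J}}$ be a family of channels, all with domain $\mathcal{X}$ and
with the same type, and let $\pi \in \mathbb{D}\mathcal{X}$, and $\mathbb{V}$ be any vulnerability. Define
$U : \mathbb{D}\mathcal{I} \times \mathbb{D}\mathcal{J} \to \mathbb{R}$ as follows:
$U(\mu, \eta) \stackrel{\text{def}}{=} \mathbb{V}[\pi, \bigoplus_{i \leftarrow \mu} \dot{\bigsqcup}_{j \leftarrow \eta} C_{ij}]$. Then $U$ is convex
on $\mu$ and linear on $\eta$.


5 Information Leakage Games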


In this section we present our framework for reasoning about information leakage, 
extending the notion of information leakage games proposed in [4] from only 
simultaneous games with hidden choice to both simultaneous and sequential 
games, with either hidden or visible choice. 

In an information leakage game the defender tries to minimize the leakage of 
information from the system, while the attacker tries to maximize it. In this basic 
scenario, their goals are just opposite (zero-sum). Both of them can influence the 
execution and the observable behavior of the system via a specific set of actions. 
We assume players to be rational (i.e., they are able to figure out what is the 
best strategy to maximize their expected payoff), and that the set of actions and 
the payoff function are common knowledge. 

Players choose their own strategy, which in general may be mixed (i.e. prob- 
abilistic), and choose their action by a random draw according to that strategy. 
After both players have performed their actions, the system runs and produces 
some output value which is visible to the attacker and may leak some informa- 
tion about the secret. The amount of leakage constitutes the attacker’s gain, and 
the defender’s loss. 

To quantify the leakage we model the system as an information-theoretic 
channel (cf. Sect. 2.2). We recall that leakage is defined as the difference (addi- 
tive leakage) or the ratio (multiplicative leakage) between posterior and prior 
vulnerability. Since we are only interested in comparing the leakage of different 
channels for a given prior, we will define the payoff just as the posterior vulner- 
ability, as the value of prior vulnerability will be the same for every channel. 


5.1 Defining Information Leakage Games 


An (information) leakage game consists of: (1) two nonempty sets $\mathcal{D}$, $\mathcal{A}$ of
defender’s and attacker’s actions respectively, (2) a function
$C : \mathcal{D} \times \mathcal{A} \to (\mathcal{X} \times \mathcal{Y} \to \mathbb{R})$ that associates to each pair of actions $(d, a) \in \mathcal{D} \times \mathcal{A}$ a
channel $C_{da} : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, (3) a prior $\pi \in \mathbb{D}\mathcal{X}$ on secrets, and (4) a vulnerability
measure $\mathbb{V}$. The payoff function $u : \mathcal{D} \times \mathcal{A} \to \mathbb{R}$ for pure strategies is defined as
$u(d, a) \stackrel{\text{def}}{=} \mathbb{V}[\pi, C_{da}]$. We have only one payoff function because the game is
zero-sum.

Like in traditional game theory, the order of actions and the extent to which
a player knows the move performed by the opponent play a critical role in
deciding strategies and determining the payoff. In security, however, knowledge of
the opponent’s move affects the game in yet another way: the effectiveness of
the attack, i.e., the amount of leakage, depends crucially on whether or not the
attacker knows what channel is being used. It is therefore convenient to
distinguish two phases in the leakage game:


Phase 1: Each player determines the most convenient strategy (which in general
is mixed) for himself, and draws his action accordingly. One of the players
may commit first to his action, and his choice may or may not be revealed to
the follower. In general, knowledge of the leader’s action may help the follower
choose a more advantageous strategy.

Phase 2: The attacker observes the output of the selected channel $C_{da}$ and
performs his attack on the secret. In case he knows the defender’s action,
he is able to determine the exact channel $C_{da}$ being used (since, of course,
the attacker knows his own action), and his payoff will be the posterior
vulnerability $\mathbb{V}[\pi, C_{da}]$. However, if the attacker does not know exactly which
channel has been used, then his payoff will be smaller.


Note that the issues raised in Phase 2 are typical of leakage games; they do 
not have a correspondence (to the best of our knowledge) in traditional game 
theory. On the other hand, these issues are central to security, as they reflect 
the principle of preventing the attacker from inferring the secret by obfuscating 
the link between secret and observables. 

Following the above discussion, we consider various possible scenarios for 
games, along two lines of classification. First, there are three possible orders for 
the two players’ actions. 


Simultaneous: The players choose (draw) their actions in parallel, each with- 
out knowing the choice of the other. 

Sequential, defender-first: The defender draws an action, and commits to it, 
before the attacker does. 

Sequential, attacker-first: The attacker draws an action, and commits to it, 
before the defender does. 


Note that these sequential games may present imperfect information (i.e., the 
follower may not know the leader’s action). 
Second, the visibility of the defender’s action during the attack may vary: 


Visible choice: The attacker knows the defender’s action when he observes the 
output of the channel, and therefore he knows which channel is being used. 
Visible choice is modeled by the operator $\dot{\bigsqcup}$.

Hidden choice: The attacker does not know the defender’s action when he 
observes the output of the channel, and therefore in general he does not 
exactly know which channel is used (although in some special cases he may 
infer it from the output). Hidden choice is modeled by the operator $\bigoplus$.


Note that the distinction between sequential and simultaneous games is orthog- 
onal to that between visible and hidden choice. Sequential and simultaneous games 
model whether or not, respectively, the follower’s choice can be affected by knowl- 
edge of the leader’s action. This dichotomy captures how knowledge about the other 
player’s actions can help a player choose his own action. On the other hand, visi- 
ble and hidden choice capture whether or not, respectively, the attacker is able to 
fully determine the channel representing the system, once defender and attacker’s 
actions have already been fixed. This dichotomy reflects the different amounts of 
information leaked by the system as viewed by the adversary. For instance, in a
simultaneous game neither player can choose his action based on the choice of the
other. However, depending on whether or not the defender’s choice is visible, the
adversary will or will not, respectively, be able to completely recover the channel
used, which will affect the amount of leakage.


Table 2. Kinds of games we consider. All sequential games have perfect information,
except for game V.

                                        Order of action
                                        simultaneous | defender 1st | attacker 1st
  Defender’s  visible $\dot{\bigsqcup}$ | Game I       | Game II      | Game III
  choice      hidden $\bigoplus$        | Game IV      | Game V       | Game VI

If we consider also the subdivision of sequential games into perfect and imper- 
fect information, there are 10 possible different combinations. Some, however, 
make little sense. For instance, a defender-first sequential game with perfect
information (by the attacker) does not combine naturally with hidden choice $\bigoplus$,
since that would mean that the attacker knows the action of the defender and
chooses his strategy accordingly, but forgets it at the moment of the attack. (We
assume perfect recall, i.e., the players never forget what they have learned.) Yet 
other combinations are not interesting, such as the attacker-first sequential game 
with (totally) imperfect information (by the defender), since it coincides with 
the simultaneous-game case. Note that attacker and defender are not symmetric 
with respect to hiding/revealing their actions a and d, since the knowledge of a 
affects the game only in the usual sense of game theory, while the knowledge of 
d also affects the computation of the payoff (cf. “Phase 2” above). 

Table 2 lists the meaningful and interesting combinations. In Game V we 
assume imperfect information: the attacker does not know the action chosen 
by the defender. In all the other sequential games we assume that the follower 
has perfect information. In the remainder of this section, we discuss each game
individually, using the example of Sect. 3 as a running example.


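Game I (simultaneous with visible choice). This simultaneous game can
be represented by a tuple $(\mathcal{D}, \mathcal{A}, u)$. As in all games with visible choice $\dot{\bigsqcup}$,
the expected payoff $U$ of a mixed strategy profile $(\delta, \alpha)$ is defined to be the
expected value of $u$, as in traditional game theory:

\[
U(\delta, \alpha) \stackrel{\text{def}}{=} \mathbb{E}_{\substack{d \leftarrow \delta \\ a \leftarrow \alpha}}\, u(d, a)
= \sum_{\substack{d \in \mathcal{D} \\ a \in \mathcal{A}}} \delta(d)\, \alpha(a)\, u(d, a),
\]

where we recall that $u(d, a) = \mathbb{V}[\pi, C_{da}]$.

From Theorem 7(b) we derive: $U(\delta, \alpha) = \mathbb{V}[\pi, \dot{\bigsqcup}_{d \leftarrow \delta,\, a \leftarrow \alpha} C_{da}]$. Hence the whole
system can be equivalently regarded as the channel $\dot{\bigsqcup}_{d \leftarrow \delta,\, a \leftarrow \alpha} C_{da}$. Still from
Theorem 7(b) we can derive that $U(\delta, \alpha)$ is linear in $\delta$ and $\alpha$. Therefore the Nash
equilibrium can be computed using the minimax method (cf. Sect. 2.1).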


Example 9. Consider the example of Sect. 3 in the setting of Game I. The Nash
equilibrium $(\delta^*, \alpha^*)$ can be obtained using the closed formula from Sect. 2.1, and
it is given by $\delta^*(0) = \alpha^*(0) = \frac{2/3 - 1}{1/2 - 1 - 1 + 2/3} = 2/5$. The corresponding payoff
is $U(\delta^*, \alpha^*) = \frac{2}{5} \cdot \frac{2}{5} \cdot \frac{1}{2} + \frac{2}{5} \cdot \frac{3}{5} + \frac{3}{5} \cdot \frac{2}{5} + \frac{3}{5} \cdot \frac{3}{5} \cdot \frac{2}{3} = 4/5$.
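
The equilibrium of Example 9 can be double-checked by computing the expected payoff and the payoffs of unilateral deviations to pure actions. The sketch below is illustrative; the names `expected_payoff`, `attacker_deviations` and `defender_deviations` are our assumptions, and the payoffs are those of Table 1.

```python
from fractions import Fraction as F

# Payoffs u[d][a] from Table 1 of the running example.
u = [[F(1, 2), F(1)], [F(1), F(2, 3)]]


def expected_payoff(delta, alpha):
    """U(delta, alpha) = sum over d, a of delta[d] * alpha[a] * u[d][a]."""
    return sum(delta[d] * alpha[a] * u[d][a]
               for d in range(2) for a in range(2))


delta_star = [F(2, 5), F(3, 5)]
alpha_star = [F(2, 5), F(3, 5)]
value = expected_payoff(delta_star, alpha_star)

# Payoffs when one player deviates to a pure action.
pure = ([F(1), F(0)], [F(0), F(1)])
attacker_deviations = [expected_payoff(delta_star, a) for a in pure]
defender_deviations = [expected_payoff(d, alpha_star) for d in pure]
```

All four deviation payoffs coincide with the equilibrium value, so neither player can gain by deviating unilaterally, as expected of a saddle point.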


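Game II (defender 1st with visible choice). This defender-first sequential
game can be represented by a tuple $(\mathcal{D}, \mathcal{D} \to \mathcal{A}, u)$. A mixed strategy profile is
of the form $(\delta, \sigma_a)$, with $\delta \in \mathbb{D}\mathcal{D}$ and $\sigma_a \in \mathbb{D}(\mathcal{D} \to \mathcal{A})$, and the corresponding
payoff is

\[
U(\delta, \sigma_a) \stackrel{\text{def}}{=} \mathbb{E}_{\substack{d \leftarrow \delta \\ s_a \leftarrow \sigma_a}}\, u(d, s_a(d))
= \sum_{\substack{d \in \mathcal{D} \\ s_a : \mathcal{D} \to \mathcal{A}}} \delta(d)\, \sigma_a(s_a)\, u(d, s_a(d)),
\]

where $u(d, s_a(d)) = \mathbb{V}[\pi, C_{d s_a(d)}]$.

Again, from Theorem 7(b) we derive: $U(\delta, \sigma_a) = \mathbb{V}[\pi, \dot{\bigsqcup}_{d \leftarrow \delta,\, s_a \leftarrow \sigma_a} C_{d s_a(d)}]$ and
hence the system can be expressed as channel $\dot{\bigsqcup}_{d \leftarrow \delta,\, s_a \leftarrow \sigma_a} C_{d s_a(d)}$. From the same
Theorem we also derive that $U(\delta, \sigma_a)$ is linear in $\delta$ and $\sigma_a$, so the mutually
optimal strategies can be obtained again by solving the minimax problem. In
this case, however, the solution is particularly simple, because it is known that
there are optimal strategies which are deterministic. Hence it is sufficient for the
defender to find the action $d$ which minimizes $\max_a u(d, a)$.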


Example 10. Consider the example of Sect. 3 in the setting of Game II. If the 
defender chooses 0 then the attacker chooses 1. If the defender chooses 1 then 
the attacker chooses 0. In both cases, the payoff is 1. The game has therefore two 
solutions, (0,1) and (1,0). 


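Game III (attacker 1st with visible choice). This game is also a sequential
game, but with the attacker as the leader. Therefore it can be represented as a
tuple of the form $(\mathcal{A} \to \mathcal{D}, \mathcal{A}, u)$. It is the same as Game II, except that the
roles of the attacker and the defender are inverted. In particular, the payoff
of a mixed strategy profile $(\sigma_d, \alpha) \in \mathbb{D}(\mathcal{A} \to \mathcal{D}) \times \mathbb{D}\mathcal{A}$ is given by

\[
U(\sigma_d, \alpha) \stackrel{\text{def}}{=} \mathbb{E}_{\substack{s_d \leftarrow \sigma_d \\ a \leftarrow \alpha}}\, u(s_d(a), a)
= \sum_{\substack{s_d : \mathcal{A} \to \mathcal{D} \\ a \in \mathcal{A}}} \sigma_d(s_d)\, \alpha(a)\, u(s_d(a), a)
= \mathbb{V}[\pi, \dot{\bigsqcup}_{s_d \leftarrow \sigma_d,\, a \leftarrow \alpha} C_{s_d(a) a}],
\]

and the whole system can be equivalently regarded as channel
$\dot{\bigsqcup}_{s_d \leftarrow \sigma_d,\, a \leftarrow \alpha} C_{s_d(a) a}$. Obviously, also in this case the minimax problem has a
deterministic solution.

In summary, in the sequential case, whether the leader is the defender or the
attacker (Games II and III, respectively), the minimax problem always has a
deterministic solution [24].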


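Theorem 11. In a defender-first sequential game with visible choice, there exist
$d \in \mathcal{D}$ and $a \in \mathcal{A}$ such that, for every $\delta \in \mathbb{D}\mathcal{D}$ and $\sigma_a \in \mathbb{D}(\mathcal{D} \to \mathcal{A})$ we have:
$U(d, \sigma_a) \le u(d, a) \le U(\delta, a)$. Similarly, in an attacker-first sequential game with
visible choice, there exist $d \in \mathcal{D}$ and $a \in \mathcal{A}$ such that, for every $\sigma_d \in \mathbb{D}(\mathcal{A} \to \mathcal{D})$
and $\alpha \in \mathbb{D}\mathcal{A}$ we have: $U(d, \alpha) \le u(d, a) \le U(\sigma_d, a)$.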


Example 12. Consider now the example of Sect. 3 in the setting of Game III. 
If the attacker chooses 0 then the defender chooses 0 and the payoff is 1/2. If the 
attacker chooses 1 then the defender chooses 1 and the payoff is 2/3. The latter 
case is more convenient for the attacker, hence the solution of the game is the 
strategy profile (1,1). 
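
Since both sequential games with visible choice admit deterministic solutions, Examples 10 and 12 can be reproduced by enumerating pure actions. The following Python sketch is illustrative; the function names are hypothetical and the payoffs are those of Table 1, with solutions reported as pairs $(d, a)$.

```python
from fractions import Fraction as F

# Payoffs u[d, a] from Table 1 of the running example.
u = {(0, 0): F(1, 2), (0, 1): F(1), (1, 0): F(1), (1, 1): F(2, 3)}
D = A = (0, 1)


def defender_first():
    """Game II: the attacker best-responds to each d; the defender minimizes."""
    worst = {d: max(u[d, a] for a in A) for d in D}
    value = min(worst.values())
    solutions = [(d, a) for d in D for a in A
                 if worst[d] == value and u[d, a] == value]
    return value, solutions


def attacker_first():
    """Game III: the defender best-responds to each a; the attacker maximizes."""
    best = {a: min(u[d, a] for d in D) for a in A}
    value = max(best.values())
    solutions = [(d, a) for a in A for d in D
                 if best[a] == value and u[d, a] == value]
    return value, solutions
```

The gap between the two values shows how much the second mover gains from observing the leader's action in this example.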
Game IV (simultaneous with hidden choice). This game is a tuple
$(\mathcal{D}, \mathcal{A}, u)$. However, it is not an ordinary game in the sense that the payoff of a
mixed strategy profile cannot be defined by averaging the payoff of the
corresponding pure strategies. More precisely, the payoff of a mixed profile is defined
by averaging on the strategy of the attacker, but not on that of the defender. In
fact, when hidden choice is used, there is an additional level of uncertainty in
the relation between the observables and the secret from the point of view of the
attacker, since he is not sure about which channel is producing those observables.
A mixed strategy $\delta$ for the defender produces a convex combination of channels
(the channels associated to the pure strategies) with the same coefficients, and
we know from previous sections that the vulnerability is a convex function of
the channel, and in general is not linear.

In order to define the payoff of a mixed strategy profile $(\delta, \alpha)$, we therefore
need to consider the channel that the attacker perceives given his limited
knowledge. Let us assume that the action that the attacker draws from $\alpha$ is $a$. He
does not know the action of the defender, but we can assume that he knows the
defender's strategy (each player can derive the optimal strategy of the opponent,
under the assumption of common knowledge and rational players).

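The channel the attacker will see is $\bigoplus_{d\leftarrow\delta} C_{da}$, obtaining a corresponding
payoff of $V[\pi, \bigoplus_{d\leftarrow\delta} C_{da}]$. By averaging on the strategy of the attacker we
obtain

$$U(\delta,\alpha) \;\stackrel{\mathrm{def}}{=}\; \mathbb{E}_{a\leftarrow\alpha}\, V\Big[\pi, \bigoplus_{d\leftarrow\delta} C_{da}\Big] \;=\; \sum_{a\in A} \alpha(a)\, V\Big[\pi, \bigoplus_{d\leftarrow\delta} C_{da}\Big].$$

From Theorem 7(b) we derive: $U(\delta,\alpha) = V[\pi, \bigsqcup_{a\leftarrow\alpha} \bigoplus_{d\leftarrow\delta} C_{da}]$ and hence the whole
system can be equivalently regarded as channel $\bigsqcup_{a\leftarrow\alpha} \bigoplus_{d\leftarrow\delta} C_{da}$. Note that, by
Proposition 6(c), the order of the operators is interchangeable, and the system
can be equivalently regarded as $\bigoplus_{d\leftarrow\delta} \bigsqcup_{a\leftarrow\alpha} C_{da}$. This shows the robustness of
this model.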

From Corollary 8 we derive that $U(\delta, \alpha)$ is convex in $\delta$ and linear in $\alpha$, hence
we can compute the Nash equilibrium by the minimax method.


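Example 13. Consider now the example of Sect. 3 in the setting of Game IV.
For $\delta \in \mathbb{D}D$ and $\alpha \in \mathbb{D}A$, let $p = \delta(0)$ and $q = \alpha(0)$. The system can be
represented by the channel $(C_{00}\,{}_p\oplus\,C_{10})\;{}_q\sqcup\;(C_{01}\,{}_p\oplus\,C_{11})$, whose two components are given below.

$$
\begin{array}{c|cc}
C_{00}\,{}_p\oplus\,C_{10} & y=0 & y=1 \\ \hline
x=0 & p & 1-p \\
x=1 & 1 & 0
\end{array}
\qquad
\begin{array}{c|cc}
C_{01}\,{}_p\oplus\,C_{11} & y=0 & y=1 \\ \hline
x=0 & 1/3 + 2/3\,p & 2/3 - 2/3\,p \\
x=1 & 2/3 - 2/3\,p & 1/3 + 2/3\,p
\end{array}
$$

For uniform $\pi$, we have $V[\pi, C_{00}\,{}_p\oplus\,C_{10}] = 1 - p/2$; and $V[\pi, C_{01}\,{}_p\oplus\,C_{11}]$
is equal to $2/3 - 2/3\,p$ if $p \le 1/4$, and equal to $1/3 + 2/3\,p$ if $p > 1/4$. Hence the
payoff, expressed in terms of $p$ and $q$, is $U(p,q) = q\,(1 - p/2) + (1-q)(2/3 - 2/3\,p)$ if
$p \le 1/4$, and $U(p,q) = q\,(1 - p/2) + (1-q)(1/3 + 2/3\,p)$ if $p > 1/4$. The Nash equilibrium
$(p^*, q^*)$ is given by $p^* = \operatorname{argmin}_p \max_q U(p,q)$ and $q^* = \operatorname{argmax}_q \min_p U(p,q)$,
and by solving the above, we obtain $p^* = q^* = 4/7$.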
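The equilibrium strategies of Example 13 can be checked numerically. The following Python sketch is only an illustration: the four matrices `C00`, `C10`, `C01`, `C11` are assumptions, chosen so that their hidden-choice combinations coincide with the two channel tables of the example, and all function names are hypothetical. Exact rational arithmetic is used so that the indifference conditions at p = q = 4/7 hold exactly rather than up to rounding.

```python
from fractions import Fraction as F


def posterior_bayes(prior, channel):
    """Posterior Bayes vulnerability: sum over y of max over x of prior[x] * C[x][y]."""
    outputs = range(len(channel[0]))
    return sum(max(prior[x] * channel[x][y] for x in range(len(prior)))
               for y in outputs)


def hidden_choice(p, c1, c2):
    """Hidden choice: the entry-wise convex combination p*c1 + (1-p)*c2."""
    return [[p * u + (1 - p) * v for u, v in zip(row1, row2)]
            for row1, row2 in zip(c1, c2)]


# Assumed pure-strategy channels C_da (rows: secret x, columns: observable y).
C00 = [[F(1), F(0)], [F(1), F(0)]]
C10 = [[F(0), F(1)], [F(1), F(0)]]
C01 = [[F(1), F(0)], [F(0), F(1)]]
C11 = [[F(1, 3), F(2, 3)], [F(2, 3), F(1, 3)]]
UNIFORM = [F(1, 2), F(1, 2)]


def payoff(p, q):
    """U(p, q): average, over the attacker's action, of the hidden-choice vulnerability."""
    v0 = posterior_bayes(UNIFORM, hidden_choice(p, C00, C10))  # attacker plays 0
    v1 = posterior_bayes(UNIFORM, hidden_choice(p, C01, C11))  # attacker plays 1
    return q * v0 + (1 - q) * v1


def defender_minimax(steps=84):
    """Grid search for argmin_p max_q U(p, q).

    U is linear in q, so the inner maximum is attained at q = 0 or q = 1.
    The grid has 84 steps so that it contains the point 4/7.
    """
    grid = [F(i, steps) for i in range(steps + 1)]
    return min(grid, key=lambda p: max(payoff(p, F(0)), payoff(p, F(1))))


if __name__ == "__main__":
    print("p* =", defender_minimax())
```

At p = 4/7 the attacker is indifferent between his two actions, and at q = 4/7 the payoff no longer depends on p on the branch p > 1/4, which is what characterises the equilibrium computed by the minimax method.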


Game V (defender 1st with hidden choice). This is a defender-first sequential
game with imperfect information, hence it can be represented as a tuple of
the form $(D, K_a \to A, u_d, u_a)$, where $K_a$ is a partition of $D$. Since we are assuming
perfect recall, and the attacker does not know anything about the action
chosen by the defender in Phase 2, i.e., at the moment of the attack (except the
probability distribution determined by the defender's strategy), we must assume that the
attacker does not know anything in Phase 1 either. Hence the indistinguishability
relation must be total, i.e., $K_a = \{D\}$. But $\{D\} \to A$ is equivalent to $A$, hence
this kind of game is equivalent to Game IV.

It is also a well-known fact in game theory that when, in a sequential game,
the follower does not know the leader's move before making his choice, the game
is equivalent to a simultaneous game.


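Game VI (attacker 1st with hidden choice). This game is also a sequential
game with the attacker as the leader, hence it is a tuple of the form
$(A \to D, A, u)$. It is similar to Game III, except that the payoff is convex on
the strategy of the defender, instead of linear. The payoff of the mixed strategy
profile $(\sigma_d, \alpha) \in \mathbb{D}(A \to D) \times \mathbb{D}A$ is

$$U(\sigma_d,\alpha) \;=\; \mathbb{E}_{a\leftarrow\alpha}\, V\Big[\pi, \bigoplus_{s_d\leftarrow\sigma_d} C_{s_d(a)\,a}\Big] \;=\; V\Big[\pi, \bigsqcup_{a\leftarrow\alpha} \bigoplus_{s_d\leftarrow\sigma_d} C_{s_d(a)\,a}\Big],$$

so the whole system can be equivalently regarded
as channel $\bigsqcup_{a\leftarrow\alpha} \bigoplus_{s_d\leftarrow\sigma_d} C_{s_d(a)\,a}$. Also in this case the minimax problem has a
deterministic solution, but only for the attacker.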


Theorem 14. In an attacker-first sequential game with hidden choice, there
exist $a \in A$ and $\delta \in \mathbb{D}D$ such that, for every $\alpha \in \mathbb{D}A$ and $\sigma_d \in \mathbb{D}(A \to D)$ we
have that $U(\delta,\alpha) \le U(\delta,a) \le U(\sigma_d,a)$.


Example 15. Consider again the example of Sect. 3, this time in the setting
of Game VI. We reuse the calculations made in Example 13, with
the same results and notation. In this setting, the attacker is obliged to
make his choice first. If he chooses 0, which corresponds to committing to the
system $C_{00}\,{}_p\oplus\,C_{10}$, then the defender will choose $p = 1$, which minimizes the
vulnerability $1 - p/2$ of that channel. If he chooses 1, which corresponds to committing to the system
$C_{01}\,{}_p\oplus\,C_{11}$, the defender will choose $p = 1/4$, which minimizes the vulnerability
of that channel. In both cases, the payoff is 1/2, hence both these
strategies are solutions to the minimax. Note that in the second case the strategy
of the defender is mixed, while that of the attacker is always pure.


5.2 Comparing the Games 


If we look at the various payoffs obtained for the running example in the
various games, we obtain the following values (listed in decreasing order):
II : 1; I : 4/5; III : 2/3; IV : 4/7; V : 4/7; VI : 1/2.


However, one could argue that, since the defender has already committed, the 
attacker does not need to perform the action corresponding to the Nash equilib- 
rium, any payoff-maximizing solution would be equally good for him. 




This order is not accidental: for any vulnerability function, and for any prior,
the various games are ordered, with respect to the payoff, as shown in Fig. 2.
The relations between II, I, and III, and between IV-V and VI come from the
fact that, in any zero-sum sequential game the leader's payoff will be less or equal
to his payoff in the corresponding simultaneous game. We think this result is
well-known in game theory, but we give the hint of the proof nevertheless, for
the sake of clarity.

Fig. 2. Order of games w.r.t. payoff. Games higher in the lattice have larger payoff.
(Lattice levels, top to bottom: II; I; III and IV=V; VI.)


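Theorem 16. It is the case that:

(a) $\displaystyle \min_{\delta}\max_{\sigma_a} V\Big[\pi, \bigsqcup_{d\leftarrow\delta,\; s_a\leftarrow\sigma_a} C_{d\,s_a(d)}\Big] \;\ge\; \min_{\delta}\max_{\alpha} V\Big[\pi, \bigsqcup_{d\leftarrow\delta,\; a\leftarrow\alpha} C_{da}\Big] \;\ge\; \max_{\alpha}\min_{\sigma_d} V\Big[\pi, \bigsqcup_{s_d\leftarrow\sigma_d,\; a\leftarrow\alpha} C_{s_d(a)\,a}\Big]$

(b) $\displaystyle \min_{\delta}\max_{\alpha} V\Big[\pi, \bigsqcup_{a\leftarrow\alpha} \bigoplus_{d\leftarrow\delta} C_{da}\Big] \;\ge\; \max_{\alpha}\min_{\sigma_d} V\Big[\pi, \bigsqcup_{a\leftarrow\alpha} \bigoplus_{s_d\leftarrow\sigma_d} C_{s_d(a)\,a}\Big]$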


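Proof. We prove the first inequality in (a). Independently of $\delta$, consider the
attacker strategy $\tau_a$ that assigns probability 1 to the function $s_a$ defined as
$s_a(d) = \operatorname{argmax}_a V[\pi, C_{da}]$. Then we have that

$$\min_{\delta}\max_{\sigma_a} V\Big[\pi, \bigsqcup_{d\leftarrow\delta,\; s_a\leftarrow\sigma_a} C_{d\,s_a(d)}\Big] \;\ge\; \min_{\delta} V\Big[\pi, \bigsqcup_{d\leftarrow\delta} C_{d\,s_a(d)}\Big] \;\ge\; \min_{\delta}\max_{\alpha} V\Big[\pi, \bigsqcup_{d\leftarrow\delta,\; a\leftarrow\alpha} C_{da}\Big].$$

Note that the strategy $\tau_a$ is optimal for the adversary, so the first of the above
inequalities is actually an equality. All other cases can be proved with an
analogous reasoning. □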
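The leader's disadvantage behind the relation between Games II and III can be observed directly on pure strategies, since the sequential games with visible choice admit deterministic solutions. The sketch below is only an illustration under assumptions: the matrix `u` is a hypothetical encoding of the pure-profile payoffs of the running example (the entries 1/2 and 2/3 are those quoted in Example 12; the two entries equal to 1 are assumed), and the function names are not taken from the paper.

```python
from fractions import Fraction as F

# Hypothetical pure-strategy payoffs u[d][a]: rows are defender actions d,
# columns are attacker actions a (the attacker maximizes, the defender minimizes).
u = [[F(1, 2), F(1)],
     [F(1), F(2, 3)]]


def defender_first(payoffs):
    """Game II: the defender commits to d, then the attacker best-responds."""
    return min(max(row) for row in payoffs)


def attacker_first(payoffs):
    """Game III: the attacker commits to a, then the defender best-responds."""
    columns = list(zip(*payoffs))
    return max(min(column) for column in columns)


if __name__ == "__main__":
    print("defender first:", defender_first(u))
    print("attacker first:", attacker_first(u))
```

With these entries the defender-first value is 1 and the attacker-first value is 2/3, in agreement with the values listed for Games II and III; the max-min never exceeds the min-max, whatever the matrix.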


Concerning III and IV-V: these are not related. In the running example the
payoff for III is higher than for IV-V, but it is easy to find other cases in which
the situation is reversed. For instance, if in the running example we set $C_{11}$ to
be the same as $C_{00}$, the payoff for III will be 1/2, and that for IV-V will be 2/3.

Finally, the relation between III and VI comes from the fact that they are
both attacker-first sequential games, and the only difference is the way in which
the payoff is defined. Then, just observe that in general we have, for every $a \in A$
and every $\delta \in \mathbb{D}D$: $V[\pi, \bigoplus_{d\leftarrow\delta} C_{da}] \le V[\pi, \bigsqcup_{d\leftarrow\delta} C_{da}]$.

The relations in Fig.2 can be used by the defender as guidelines to better 
protect the system, if he has some control over the rules of the game. Obviously, 
for the defender the games lower in the ordering are to be preferred. 
6 Case Study: A Safer, Faster Password-Checker


In this section we apply our game-theoretic, 
compositional approach to show how a 
defender can mitigate an attacker’s typical 
timing side-channel attack while avoiding 
the usual burden imposed on the password- 
checker’s efficiency. 

Consider the password-checker PWD123 of Fig. 3, which performs a bitwise-check of
a 3-bit low-input $a = a_1a_2a_3$, provided by the attacker, against a 3-bit secret
password $x = x_1x_2x_3$. The low-input is rejected as soon as it mismatches the
secret, and is accepted otherwise.

Program PWD123
High Input: $x \in \{000, 001, \ldots, 111\}$
Low Input: $a \in \{000, 001, \ldots, 111\}$
Output: $y \in \{T, F\}$
accept := T
for i = 1, 2, 3 do
    if $a_i \ne x_i$ then
        accept := F
        break
    end if
end for
return accept

The attacker can choose low-inputs to 
try to gain information about the password. 
Obviously, in case PWD123 accepts the low-input, the attacker learns the password 
value is a = x. Yet, even when the low-input is rejected, there is some leakage of 
information: from the duration of the execution the attacker can estimate how 
many iterations have been performed before the low-input was rejected, thus 
inferring a prefix of the secret password. 

To model this scenario, let $\mathcal{X} = \{000, 001, \ldots, 111\}$ be the set of all possible
3-bit passwords, and $\mathcal{Y} = \{(F,1), (F,2), (F,3), (T,3)\}$ be the set of observables
produced by the system. Each observable is an ordered pair whose first element
indicates whether the password was accepted (T or F), and the second element
indicates the duration of the computation (1, 2, or 3 iterations). For instance,
channel $C_{123,101}$ in Fig. 4 models PWD123's behavior when the attacker provides
low-input a = 101.
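The checker and the channel it induces can be made concrete with a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names `pwd_check` and `channel` are hypothetical, passwords are represented as 3-character bit strings, and the `order` parameter (1-based bit positions) is an assumption introduced so that the same code also covers checkers that inspect the bits in another order.

```python
from itertools import product


def pwd_check(secret, guess, order=(1, 2, 3)):
    """Compare guess against secret bit by bit, in the given order of positions.

    Returns the observable (verdict, iterations): the verdict is "T" or "F" and
    iterations is the number of loop iterations executed before returning.
    """
    iterations = 0
    for i in order:
        iterations += 1
        if guess[i - 1] != secret[i - 1]:
            return ("F", iterations)  # rejected at the first mismatching bit
    return ("T", iterations)


# All 3-bit passwords, in the order 000, 001, ..., 111.
SECRETS = ["".join(bits) for bits in product("01", repeat=3)]


def channel(guess, order=(1, 2, 3)):
    """Deterministic channel: maps each secret x to the observable it produces."""
    return {x: pwd_check(x, guess, order) for x in SECRETS}


if __name__ == "__main__":
    for x, y in channel("101").items():
        print(x, y)
```

For the low-input 101 and the order 1, 2, 3, the secrets starting with 0 are rejected after one iteration, 110 and 111 after two, 100 after three, and only 101 is accepted.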

We will adopt as a measure of information Bayes vulnerability [27]. The prior
Bayes vulnerability of a distribution $\pi \in \mathbb{D}\mathcal{X}$ is defined as $V[\pi] = \max_{x\in\mathcal{X}} \pi_x$,
and represents the probability that the attacker guesses correctly the password
in one try. For instance, if the distribution on all possible 3-bit passwords is
$\pi = (0.0137, 0.0548, 0.2191, 0.4382, 0.0002, 0.0002, 0.0548, 0.2191)$, its prior Bayes
vulnerability is $V[\pi] = 0.4382$.

The posterior Bayes vulnerability of a prior $\pi$ and a channel $C : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ is
defined as $V[\pi, C] = \sum_{y\in\mathcal{Y}} \max_{x\in\mathcal{X}} \pi_x C(x,y)$, and it represents the probability
that the attacker guesses correctly the password in one try, after he observes
the output of the channel (i.e., after he has measured the time needed for the
checker to accept or reject the low-input). For prior $\pi$ above, the posterior Bayes
vulnerability of channel $C_{123,101}$ is $V[\pi, C_{123,101}] = 0.6577$ (which represents an
increase in Bayes vulnerability of about 50%), and the expected running time
for this checker is of 1.2747 iterations.
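These figures can be recomputed from the definitions. The sketch below is illustrative only; the helper names are hypothetical, and the prior is entered exactly as printed, i.e. rounded to four decimals (its entries sum to 1.0001), so the results agree with the reported values only up to that rounding.

```python
from itertools import product

SECRETS = ["".join(bits) for bits in product("01", repeat=3)]
# Prior over 000, 001, ..., 111, as printed in the text (rounded values).
PRIOR = dict(zip(SECRETS,
                 [0.0137, 0.0548, 0.2191, 0.4382, 0.0002, 0.0002, 0.0548, 0.2191]))


def observable(secret, guess, order=(1, 2, 3)):
    """Observable (verdict, iterations) produced by the early-exit checker."""
    for step, i in enumerate(order, start=1):
        if guess[i - 1] != secret[i - 1]:
            return ("F", step)
    return ("T", len(order))


def prior_vulnerability(prior):
    """V[pi]: probability of guessing the secret in one try, with no observation."""
    return max(prior.values())


def posterior_vulnerability(prior, guess, order=(1, 2, 3)):
    """V[pi, C]: for each observable keep the most likely secret, then sum."""
    best = {}
    for x, px in prior.items():
        y = observable(x, guess, order)
        best[y] = max(best.get(y, 0.0), px)
    return sum(best.values())


def expected_iterations(prior, guess, order=(1, 2, 3)):
    """Expected number of loop iterations for a fixed low-input."""
    return sum(px * observable(x, guess, order)[1] for x, px in prior.items())


if __name__ == "__main__":
    print(prior_vulnerability(PRIOR))
    print(posterior_vulnerability(PRIOR, "101"))
    print(expected_iterations(PRIOR, "101"))
```

Because the channel is deterministic, the posterior vulnerability reduces to summing, over the observables, the largest prior probability among the secrets that produce that observable.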

A way to mitigate this timing side-channel is to make the checker's execution
time independent of the secret. Channel $C_{cons,101}$ from Fig. 4 models a checker
that does that (by eliminating the break command within the loop in PWD123)
when the attacker's low-input is a = 101. This channel's posterior Bayes
vulnerability is $V[\pi, C_{cons,101}] = 0.4384$, which brings the multiplicative Bayes leakage
down to an increase of only about 0.05%. However, the expected running time
goes up to 3 iterations (an increase of about 135% w.r.t. that of $C_{123,101}$).

Fig. 3. Password-checker algorithm.
Seeking some compromise between security and efficiency, assume that the
defender can employ password-checkers that perform the bitwise comparison
among low-input a and secret password x in different orders. More precisely,
there is one version of the checker for every possible order in which the index
i ranges in the control of the loop. For instance, while PWD123 checks the bits
in the order 1, 2, 3, the alternative algorithm PWD231 uses the order 2, 3, 1.

Channel $C_{123,101}$:
          | y=(F,1) | y=(F,2) | y=(F,3) | y=(T,3)
  x = 000 |    1    |    0    |    0    |    0
  x = 001 |    1    |    0    |    0    |    0
  x = 010 |    1    |    0    |    0    |    0
  x = 011 |    1    |    0    |    0    |    0
  x = 100 |    0    |    0    |    1    |    0
  x = 101 |    0    |    0    |    0    |    1
  x = 110 |    0    |    1    |    0    |    0
  x = 111 |    0    |    1    |    0    |    0

Channel $C_{cons,101}$:
          | y=(F,3) | y=(T,3)
  x = 000 |    1    |    0
  x = 001 |    1    |    0
  x = 010 |    1    |    0
  x = 011 |    1    |    0
  x = 100 |    1    |    0
  x = 101 |    0    |    1
  x = 110 |    1    |    0
  x = 111 |    1    |    0

To determine a defender's best choice of which versions of the checker
to run, we model this problem as a game. The attacker's actions $A =
\{000, 001, \ldots, 111\}$ are all possible low-inputs to the checker, and the defender's
actions $D = \{123, 132, 213, 231, 312, 321\}$ are all orders in which to perform the comparison.
Hence, there is a total of 48 possible channels $C_{da} : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, one for each
combination of $d \in D$, $a \in A$.
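The 48 channels need not be written out by hand: each is determined by a comparison order and a low-input. The following sketch enumerates them and computes, for each, the posterior Bayes vulnerability defined above. It is an illustration under assumptions: the names `ORDERS`, `utility` and `TABLE` are hypothetical, orders are encoded as tuples of 1-based bit positions, and the prior is the rounded one printed earlier, so the values match the reported ones only to about three decimals.

```python
from itertools import permutations, product

SECRETS = ["".join(bits) for bits in product("01", repeat=3)]
PRIOR = dict(zip(SECRETS,
                 [0.0137, 0.0548, 0.2191, 0.4382, 0.0002, 0.0002, 0.0548, 0.2191]))
ORDERS = list(permutations((1, 2, 3)))  # the defender's six actions
GUESSES = SECRETS                       # the attacker's eight actions


def observable(secret, guess, order):
    """Observable (verdict, iterations) of the early-exit checker."""
    for step, i in enumerate(order, start=1):
        if guess[i - 1] != secret[i - 1]:
            return ("F", step)
    return ("T", len(order))


def utility(order, guess):
    """Posterior Bayes vulnerability of the deterministic channel C_da."""
    best = {}
    for x, px in PRIOR.items():
        y = observable(x, guess, order)
        best[y] = max(best.get(y, 0.0), px)
    return sum(best.values())


# One entry per pure strategy profile (d, a).
TABLE = {(d, a): utility(d, a) for d in ORDERS for a in GUESSES}

if __name__ == "__main__":
    for d in ORDERS:
        row = " ".join(f"{TABLE[(d, a)]:.4f}" for a in GUESSES)
        print("".join(map(str, d)), row)
```

Printing the dictionary row by row gives one line per comparison order and one column per low-input, which is the layout used for the pure-profile payoffs of the game.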

In our framework, the utility of a mixed strategy profile $(\delta, \alpha)$ is given by
$U(\delta,\alpha) = \mathbb{E}_{a\leftarrow\alpha} V[\pi, \bigoplus_{d\leftarrow\delta} C_{da}]$. For each pure strategy profile $(d, a)$,
the payoff of the game will be the posterior Bayes vulnerability of the resulting
channel $C_{da}$ (since, if we are measuring leakage, the prior vulnerability is the same
for every channel once the prior is fixed). Table 3 depicts such payoffs. Note
that the attacker's and defender's actions substantially affect the effectiveness of
the attack: vulnerability ranges between 0.4934 and 0.9311 (and so multiplicative
leakage is in the range between an increase of 12% and one of 112%). Using
techniques from [4], we can compute the best (mixed) strategy for the defender in this
game, which turns out to be $\delta^* = (0.1667, 0.1667, 0.1667, 0.1667, 0.1667, 0.1667)$.
This strategy is part of an equilibrium and guarantees that for any choice of the
attacker the posterior Bayes vulnerability is at most 0.6573 (so the multiplicative
leakage is bounded by 50%, an intermediate value between the minimum of
about 12% and the maximum of about 112%). It is interesting to note that the
expected running time, for any action of the attacker, is at most
2.3922 iterations (an increase of only 87% w.r.t. the channel PWD123), which is
below the worst possible expected 3 iterations of the constant-time password
checker.

Fig. 4. Channels $C_{da}$ modeling the password checker for defender's action d and
attacker's action a.

Table 3. Utility U(d, a) for each pure strategy profile.

  Defender's |                     Attacker's action a
  action d   |  000   |  001   |  010   |  011   |  100   |  101   |  110   |  111
  123        | 0.7257 | 0.7257 | 0.9311 | 0.9311 | 0.6577 | 0.6577 | 0.7122 | 0.7122
  132        | 0.8900 | 0.9311 | 0.8900 | 0.9311 | 0.7122 | 0.7122 | 0.7122 | 0.7122
  213        | 0.5068 | 0.5068 | 0.9311 | 0.9311 | 0.4934 | 0.4934 | 0.7668 | 0.7668
  231        | 0.5068 | 0.5068 | 0.7668 | 0.9311 | 0.5068 | 0.5068 | 0.7668 | 0.9311
  312        | 0.7257 | 0.9311 | 0.7257 | 0.9311 | 0.7122 | 0.8766 | 0.7122 | 0.8766
  321        | 0.6712 | 0.7122 | 0.7257 | 0.9311 | 0.6712 | 0.7122 | 0.7257 | 0.9311
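The guarantees quoted for the uniform defender strategy can be re-derived with a short script. This is a sketch under assumptions, not the technique of [4]: it does not search for the optimal strategy, it only evaluates a given mixed strategy; the helper names are hypothetical, and the prior is the rounded one printed earlier, so agreement with the reported figures holds up to that rounding. The key point is that, under hidden choice, the attacker faces the convex combination of the six channels, and the vulnerability is computed on that combined channel rather than averaged over the six.

```python
from collections import defaultdict
from itertools import permutations, product

SECRETS = ["".join(bits) for bits in product("01", repeat=3)]
PRIOR = dict(zip(SECRETS,
                 [0.0137, 0.0548, 0.2191, 0.4382, 0.0002, 0.0002, 0.0548, 0.2191]))
ORDERS = list(permutations((1, 2, 3)))


def observable(secret, guess, order):
    """Observable (verdict, iterations) of the early-exit checker."""
    for step, i in enumerate(order, start=1):
        if guess[i - 1] != secret[i - 1]:
            return ("F", step)
    return ("T", len(order))


def hidden_choice_vulnerability(strategy, guess):
    """Posterior Bayes vulnerability of the channel sum_d strategy[d] * C_da."""
    # joint[y][x] = pi_x * sum_d strategy[d] * C_da(x, y)
    joint = defaultdict(lambda: defaultdict(float))
    for x, px in PRIOR.items():
        for order, weight in strategy.items():
            joint[observable(x, guess, order)][x] += px * weight
    return sum(max(column.values()) for column in joint.values())


def expected_iterations(strategy, guess):
    """Expected running time when the order is drawn from the mixed strategy."""
    return sum(px * weight * observable(x, guess, order)[1]
               for x, px in PRIOR.items()
               for order, weight in strategy.items())


UNIFORM = {order: 1 / 6 for order in ORDERS}
worst_vulnerability = max(hidden_choice_vulnerability(UNIFORM, a) for a in SECRETS)
worst_running_time = max(expected_iterations(UNIFORM, a) for a in SECRETS)

if __name__ == "__main__":
    print(round(worst_vulnerability, 4), round(worst_running_time, 4))
```

A point-mass strategy such as `{(1, 2, 3): 1.0}` recovers the corresponding row of Table 3, which gives a simple consistency check between the pure and the mixed case.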


7 Related Work 


Many studies have applied game theory to analyses of security and privacy in 
networks [3,7,14], cryptography [15], anonymity [1], location privacy [13], and 
intrusion detection [30], to cite a few. See [20] for a survey. 

In the context of quantitative information flow, most works consider only 
passive attackers. Boreale and Pampaloni [8] consider adaptive attackers, but 
not adaptive defenders, and show that in this case the adversary's optimal
strategy can always be deterministic. Mardziel et al. [21] propose a model for both
adaptive attackers and defenders, but in none of their extensive case studies does the
attacker need a probabilistic strategy to maximize leakage. In this paper we
characterize when randomization is necessary, for either attacker or defender, to 
achieve optimality in our general information leakage games. 

Security games have been employed to model and analyze payoffs between 
interacting agents, especially between a defender and an attacker. Korzhyk et al. 
[19] theoretically analyze security games and study the relationships between 
Stackelberg and Nash Equilibria under various forms of imperfect information. 
Khouzani and Malacaria [18] study leakage properties when perfect secrecy is 
not achievable due to constraints on the allowable size of the conflating sets, 
and provide universally optimal strategies for a wide class of entropy measures, 
and for g-entropies. Unlike ours, these works do not consider games with
hidden choice, in which optimal strategies differ from those of traditional game theory.

Several security games have modeled leakage when the sensitive information
is the defender's choices themselves, rather than a system's high input.
For instance, Alon et al. [2] propose zero-sum games in which a defender chooses
probabilities of secrets and an attacker chooses and learns some of the defender's
secrets. They then show how the leakage on the defender's secrets influences
the defender's optimal strategy. More recently, Xu et al. [29] show
zero-sum games in which the attacker obtains partial knowledge on the security
resources that the defender protects, and provide the defender's optimal strategy
given this knowledge of the attacker.

Regarding channel operators, sequential and parallel composition of channels 
have been studied (e.g., [17]), but we are unaware of any explicit definition 
and investigation of hidden and visible choice operators. Although Kawamoto 
et al. [16] implicitly use the hidden choice to model a probabilistic system as the 
weighted sum of systems, they do not derive the set of algebraic properties we 
do for this operator, and for its interaction with the visible choice operator. 
8 Conclusion and Future Work


In this paper we used protocol composition to model the introduction of noise 
performed by the defender to prevent leakage of sensitive information. More 
precisely, we formalized visible and hidden probabilistic choices of different pro- 
tocols. We then formalized the interplay between defender and adversary in a 
game-theoretic framework adapted to the specific issues of QIF, where the payoff 
is information leakage. We considered various kinds of leakage games, depending 
on whether players act simultaneously or sequentially, and whether the choices 
of the defender are visible or not to the adversary. We established a hierarchy 
of these games, and provided methods for finding the optimal strategies (at the 
points of equilibrium) in the various cases. 

As future research, we would like to extend leakage games to the case of 
repeated observations, i.e., when the attacker can observe the outcomes of the 
system in successive runs, under the assumption that both attacker and defender 
may change the channel in each run. We would also like to extend our
framework to non-zero-sum games, in which the costs of attack and defense are not
equivalent, and to analyze differentially-private mechanisms.
Abstract. Recently, many tools have been proposed for automatically
analysing, in symbolic models, equivalence of security protocols. Equiv- 
alence is a property needed to state privacy properties or game-based 
properties like strong secrecy. Tools for a bounded number of sessions 
can decide equivalence but typically suffer from efficiency issues. Tools 
for an unbounded number of sessions like Tamarin or ProVerif prove a 
stronger notion of equivalence (diff-equivalence) that does not properly 
handle protocols with else branches. 

Building upon a recent approach, we propose a type system for rea- 
soning about branching protocols and dynamic keys. We prove our type 
system to entail equivalence, for all the standard primitives. Our type 
system has been implemented and shows a significant speedup compared 
to the tools for a bounded number of sessions, and compares similarly 
to ProVerif for an unbounded number of sessions. Moreover, we can also 
prove security of protocols that require a mix of bounded and unbounded 
number of sessions, which ProVerif cannot properly handle. 


1 Introduction 


Formal methods provide a rigorous and convenient framework for analysing secu- 
rity protocols. In particular, mature push-button analysis tools have emerged 
and have been successfully applied to many protocols from the literature in the 
context of trace properties such as authentication or confidentiality. These tools 
employ a variety of analysis techniques, such as model checking (e.g., Avispa [6] 
and Scyther [31]), Horn clause resolution (e.g., ProVerif [13]), term rewriting 
(e.g., Scyther [31] and Tamarin [38]), and type systems [7, 12, 16-21, 34, 36,37]. 
In recent years, attention has also been given to equivalence properties,
which are crucial to model privacy properties such as vote privacy [8,33],
unlinkability [5], or anonymity [9]. For example, consider an authentication protocol
$P_{pass}$ embedded in a biometric passport. $P_{pass}$ preserves anonymity of passport
holders if an attacker cannot distinguish an execution with Alice from an
execution with Bob. This can be expressed by the equivalence $P_{pass}(Alice) \approx
P_{pass}(Bob)$. Equivalence is also used to express properties closer to cryptographic
games like strong secrecy.
© The Author(s) 2018 
Two main classes of tools have been developed for equivalence. First, in the
case of an unbounded number of sessions (when the protocol is executed arbitrarily
many times), equivalence is undecidable. Instead, the tools ProVerif [13, 15]
and Tamarin [11,38] try to prove a stronger property, namely diff-equivalence,
that may be too strong, e.g. in the context of voting. Tamarin covers a larger class
of protocols but may require some guidance from the user. Maude-NPA [35, 40]
also proves diff-equivalence but may have non-termination issues. Another class
of tools aims at deciding equivalence, for a bounded number of sessions. This is the
case in particular of SPEC [32], APTE [23], Akiss [22], and SatEquiv [26]. SPEC,
APTE, and Akiss suffer from efficiency issues and typically cannot handle more
than 3-4 sessions. SatEquiv is much more efficient but is limited to symmetric
encryption and requires protocols to be well-typed, which often assumes some
additional tagging of the protocol.


Our Contribution. Following the approach of [28], we propose a novel technique 
for proving equivalence properties for a bounded number of sessions as well as an 
unbounded number of sessions (or a mix of both), based on typing. [28] proposes
a first type system that entails trace equivalence $P \approx_t Q$, provided protocols
use fixed (long-term) keys, identical in $P$ and $Q$. In this paper, we target a
larger class of protocols, that includes in particular key-exchange protocols and 
protocols whose security relies on branching on the secret. This is the case e.g. 
of the private authentication protocol [3], where agent B returns a true answer 
to A, encrypted with A’s public key if A is one of his friends, and sends a decoy 
message (encrypted with a dummy key) otherwise. 

We devise a new type system for reasoning about keys. In particular, we 
introduce bikeys to cover behaviours where keys in P differ from the keys in Q. 
We design new typing rules to reason about protocols that may branch differently 
(in P and Q), depending on the input. Following the approach of [28], our type 
system collects sent messages into constraints that are required to be consistent. 
Intuitively, the type system guarantees that any execution of P can be matched 
by an execution of Q, while consistency imposes that the resulting sequences 
of messages are indistinguishable for an attacker. We had to entirely revisit the 
approach of [28] and prove a finer invariant in order to cope with the case where 
keys are used as variables. Specifically, most of the rules for encryption, signature, 
and decryption had to be adapted to accommodate the flexible usage of keys. 
For messages, we had to modify the rules for keys and encryption, in order to 
encrypt messages with keys of different type (bi-key type), instead of only fixed 
keys. We show that our type system entails equivalence for the standard notion 
of trace equivalence [24] and we devise a procedure for proving consistency. This 
yields an efficient approach for proving equivalence of protocols for a bounded 
and an unbounded number of sessions (or a combination of both). 

We implemented a prototype of our type-checker that we evaluate on a set of 
examples, that includes private authentication, the BAC protocol (of the biomet- 
ric passport), as well as Helios together with the setup phase. Our tool requires a 
light type annotation that specifies which keys and names are likely to be secret 
or public and the form of the messages encrypted by a given key. This can be 
easily inferred from the structure of the protocol. Our type-checker outperforms
even the most efficient existing tools for a bounded number of sessions by two 
(for examples with few processes) to three (for examples with more processes) 
orders of magnitude. Note however that these tools decide equivalence while 
our type system is incomplete. In the case of an unbounded number of sessions, 
on our examples, the performance is comparable to ProVerif, one of the most 
popular tools. We consider in particular vote privacy in the Helios protocol, in 
the case of a dishonest ballot board, with no revote (as the protocol is insecure 
otherwise). ProVerif fails to handle this case as it cannot (faithfully) consider 
a mix of bounded and unbounded number of sessions. Compared to [28], our 
analysis includes the setup phase (where voters receive the election key), which 
could not be considered before. 

The technical details and proofs omitted due to space constraints are available 
in the companion technical report [29]. 


2 High-Level Description 


2.1 Background 


Trace equivalence of two processes is a property that guarantees that an attacker 
observing the execution of either of the two processes cannot decide which one it 
is. Previous work [28] has shown how trace equivalence can be proved statically 
using a type system combined with a constraint checking procedure. The type
system consists of typing rules of the form $\Gamma \vdash P \sim Q \to C$, meaning that in
an environment $\Gamma$ two processes $P$ and $Q$ are equivalent if the produced set of
constraints $C$, encoding the attacker observables, is consistent.

The typing environment $\Gamma$ is a mapping from nonces, keys, and variables to
types. Nonces are assigned security labels with a confidentiality and an integrity 
component, e.g. HL for high confidentiality and low integrity. Key types are of
the form $\mathrm{key}^l(T)$ where $l$ is the security label of the key and $T$ is the type of the
payload. Key types are crucial to convey typing information from one process to
another one. Normally, we cannot make any assumptions about values received
from the network, since they might possibly originate from the attacker. If we however
successfully decrypt a message using a secret symmetric key, we know that the 
result is of the key’s payload type. This is enforced on the sender side, whenever 
outputting an encryption. 

A core assumption of virtually any efficient static analysis for equivalence is 
uniform execution, meaning that the two processes of interest always take the 
same branch in a branching statement. For instance, this means that all decryp- 
tions must always succeed or fail equally in the two processes. For this reason, 
previous work introduced a restriction to allow only encryption and decryption 
with keys whose equality could be statically proved. 


2.2 Limitation 


…
$\Gamma(k_a, k) = \mathrm{key}^{HH}(HL)$      authentication succeeded on the left, failed on the right
$\Gamma(k, k_c) = \mathrm{key}^{HH}(HL)$      authentication succeeded on the right, failed on the left
$\Gamma(k_a, k_c) = \mathrm{key}^{HH}(HL)$    authentication succeeded on both sides
$\Gamma(k, k) = \mathrm{key}^{HH}(HL)$        authentication failed on both sides

Fig. 1. Key types for the private authentication protocol

There are however protocols that require non-uniform execution for a proof of
trace equivalence, e.g., the private authentication protocol [3]. The protocol aims
at authenticating B to A, anonymously w.r.t. other agents. More specifically,
agent B may refuse to communicate with agent A but a third agent D should
not learn whether B declines communication with A or not. The protocol can be
informally described as follows, where pk(k) denotes the public key associated
to key k, and aenc(M, pk(k)) denotes the asymmetric encryption of message M
with this public key.

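$$
\begin{array}{lll}
A \to B: & \mathrm{aenc}(\langle N_a, \mathrm{pk}(k_a)\rangle, \mathrm{pk}(k_b)) & \\
B \to A: & \mathrm{aenc}(\langle N_a, \langle N_b, \mathrm{pk}(k_b)\rangle\rangle, \mathrm{pk}(k_a)) & \text{if } B \text{ accepts } A\text{'s request} \\
         & \mathrm{aenc}(N_b, \mathrm{pk}(k)) & \text{if } B \text{ declines } A\text{'s request}
\end{array}
$$

If B declines to communicate with A, he sends a decoy message
$\mathrm{aenc}(N_b, \mathrm{pk}(k))$ where pk(k) is a decoy key (no one knows the private key k).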


2.3 Encrypting with Different Keys 


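Let $P_a(k_a, \mathrm{pk}(k_b))$ model agent A willing to talk with B, and $P_b(k_b, \mathrm{pk}(k_a))$
model agent B willing to talk with A (and declining requests from other agents).
We model the protocol as:

$$
\begin{array}{l}
P_a(k_a, pk_b) = \mathtt{new}\ N_a.\ \mathtt{out}(\mathrm{aenc}(\langle N_a, \mathrm{pk}(k_a)\rangle, pk_b)).\ \mathtt{in}(z) \\
P_b(k_b, pk_a) = \mathtt{new}\ N_b.\ \mathtt{in}(x). \\
\quad \mathtt{let}\ y = \mathrm{adec}(x, k_b)\ \mathtt{in}\ \mathtt{let}\ y_1 = \pi_1(y)\ \mathtt{in}\ \mathtt{let}\ y_2 = \pi_2(y)\ \mathtt{in} \\
\quad \mathtt{if}\ y_2 = pk_a\ \mathtt{then} \\
\quad\quad \mathtt{out}(\mathrm{aenc}(\langle y_1, \langle N_b, \mathrm{pk}(k_b)\rangle\rangle, pk_a)) \\
\quad \mathtt{else}\ \mathtt{out}(\mathrm{aenc}(N_b, \mathrm{pk}(k)))
\end{array}
$$

where adec(M, k) denotes asymmetric decryption of message M with private
key k. We model anonymity as the following equivalence, intuitively stating that
an attacker should not be able to tell whether B accepts requests from the agent
A or C:

$$P_a(k_a, \mathrm{pk}(k_b)) \mid P_b(k_b, \mathrm{pk}(k_a)) \;\approx_t\; P_a(k_a, \mathrm{pk}(k_b)) \mid P_b(k_b, \mathrm{pk}(k_c))$$

We now show how we can type the protocol in order to show trace
equivalence. The initiator $P_a$ is trivially executing uniformly, since it does not contain
any branching operations. We hence focus on typing the responder $P_b$.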

The beginning of the responder protocol can be typed using standard
techniques. Then however, we perform the test $y_2 = \mathrm{pk}(k_a)$ on the left side and
$y_2 = \mathrm{pk}(k_c)$ on the right side. Since we cannot statically determine the result
of the two equality checks, and thus cannot guarantee uniform execution, we have
to typecheck the four possible combinations of then and else branches. This
means we have to typecheck outputs of encryptions that use different keys on
the left and the right side.

To deal with this we do not assign types to single keys, but rather to pairs of 
keys (k, k’) — which we call bikeys — where k is the key used in the left process 
and k’ is the key used in the right process. The key types used for typing are 
presented in Fig. 1. 

As an example, we consider the combination of the then branch on the 
left with the else branch on the right. This combination occurs when A is 
successfully authenticated on the left side, while being rejected on the right side. 
We then have to typecheck B’s positive answer together with the decoy message: 
$\Gamma \vdash \mathrm{aenc}(\langle y_1, \langle N_b, \mathrm{pk}(k_b)\rangle\rangle, \mathrm{pk}(k_a)) \sim \mathrm{aenc}(N_b, \mathrm{pk}(k)) : LL$. For this we need the
type for the bikey $(k_a, k)$.


2.4  Decrypting Non-uniformly 


When decrypting a ciphertext that was potentially generated using two different 
keys on the left and the right side, we have to take all possibilities into account. 
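Consider the following extension of the process $P_a$ where agent A decrypts B's
message.

$$
\begin{array}{l}
P_a(k_a, pk_b) = \mathtt{new}\ N_a.\ \mathtt{out}(\mathrm{aenc}(\langle N_a, \mathrm{pk}(k_a)\rangle, pk_b)).\ \mathtt{in}(z). \\
\quad \mathtt{let}\ z' = \mathrm{adec}(z, k_a)\ \mathtt{in}\ \mathtt{out}(1) \\
\quad \mathtt{else}\ \mathtt{out}(0)
\end{array}
$$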


In the decryption, there are the following possible cases:

- The message is a valid encryption supplied by the attacker (using the public
  key $\mathrm{pk}(k_a)$), so we check the then branch on both sides with $\Gamma(z') = LL$.
- The message is not a valid encryption supplied by the attacker so we check
  the else branch on both sides.
- The message is a valid response from B. The keys used on the left and the
  right are then one of the four possible combinations $(k_a, k)$, $(k_a, k_c)$, $(k, k_c)$
  and $(k, k)$.
  * In the first two cases the decryption will succeed on the left and fail on
    the right. We hence check the then branch on the left with $\Gamma(z') = HL$
    with the else branch on the right. If the type $\Gamma(k_a, k)$ were different from
    $\Gamma(k_a, k_c)$, we would check this combination twice, using the two different
    payload types.
  * In the remaining two cases the decryption will fail on both sides. We hence
    would have to check the two else branches (which however we already
    did).

While checking the then branch together with the else branch, we have to
check $\Gamma \vdash 1 \sim 0 : LL$, which rightly fails, as the protocol does not guarantee
trace equivalence.
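The case analysis for a valid response from B can be replayed mechanically. The sketch below is a deliberately simplified illustration and not the type system itself: a ciphertext is represented as a pair (payload, name of the key whose public part was used), `adec` is a hypothetical symbolic decryption that succeeds only when the private key matches, and the key names `ka`, `kc`, `k` are plain strings.

```python
def adec(ciphertext, private_key):
    """Symbolic asymmetric decryption: returns the payload, or None on failure."""
    payload, key_name = ciphertext
    return payload if key_name == private_key else None


# The four bikeys (left key, right key) under which B's response may be encrypted.
BIKEYS = [("ka", "k"), ("ka", "kc"), ("k", "kc"), ("k", "k")]


def branches(bikey, decryption_key="ka"):
    """Branch taken by A on the left and on the right when decrypting with ka."""
    taken = []
    for key_name in bikey:
        result = adec(("response", key_name), decryption_key)
        taken.append("then" if result is not None else "else")
    return tuple(taken)


def non_uniform_cases(bikeys=BIKEYS):
    """Bikeys for which the two processes take different branches."""
    return [bikey for bikey in bikeys if len(set(branches(bikey))) > 1]


if __name__ == "__main__":
    for bikey in BIKEYS:
        print(bikey, branches(bikey))
```

The two bikeys whose left component is `ka` are exactly those where the left process continues in the then branch while the right one falls into the else branch, which is the combination that forces the check of 1 against 0.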
3 Model


In symbolic models, security protocols are typically modelled as processes of a 
process algebra, such as the applied pi-calculus [2]. We present here a calculus 
used in [28] and inspired from the calculus underlying the ProVerif tool [14]. This 
section is mostly an excerpt of [28], recalled here for the sake of completeness, 
and illustrated with the private authentication protocol. 


3.1 Terms 


Messages are modelled as terms. We assume an infinite set of names $\mathcal{N}$ for nonces,
further partitioned into the set $\mathcal{FN}$ of free nonces (created by the attacker) and
the set $\mathcal{BN}$ of bound nonces (created by the protocol parties), an infinite set of
names $\mathcal{K}$ for keys similarly split into $\mathcal{FK}$ and $\mathcal{BK}$, and an infinite set of variables
$\mathcal{V}$. Cryptographic primitives are modelled through a signature $\mathcal{F}$, that is, a set
of function symbols, given with their arity (i.e. the number of arguments). Here,
we consider the following signature:

\[
\mathcal{F}_c = \{\mathsf{pk}, \mathsf{vk}, \mathsf{enc}, \mathsf{aenc}, \mathsf{sign}, \langle\cdot,\cdot\rangle, \mathsf{h}\}
\]

that models respectively public and verification keys, symmetric and asymmetric
encryption, signature, concatenation and hash. The companion primitives (symmetric and
asymmetric decryption, signature check, and projections) are represented by the
following signature:

\[
\mathcal{F}_d = \{\mathsf{dec}, \mathsf{adec}, \mathsf{checksign}, \pi_1, \pi_2\}
\]


We also consider a set $\mathcal{C}$ of (public) constants (used as agent names for instance).
Given a signature $\mathcal{F}$, a set of names $\mathcal{N}$, and a set of variables $\mathcal{V}$, the set of terms
$\mathcal{T}(\mathcal{F}, \mathcal{V}, \mathcal{N})$ is the set inductively defined by applying functions to variables in $\mathcal{V}$
and names in $\mathcal{N}$. We denote by $names(t)$ (resp. $vars(t)$) the set of names (resp.
variables) occurring in $t$. A term is ground if it does not contain variables.

We consider the set $\mathcal{T}(\mathcal{F}_c \cup \mathcal{F}_d \cup \mathcal{C}, \mathcal{V}, \mathcal{N} \cup \mathcal{K})$ of cryptographic terms, simply
called terms. Messages are terms with constructors from $\mathcal{T}(\mathcal{F}_c \cup \mathcal{C}, \mathcal{V}, \mathcal{N} \cup \mathcal{K})$.
We assume the set of variables to be split into two subsets $\mathcal{V} = \mathcal{X} \uplus \mathcal{AX}$ where
$\mathcal{X}$ are variables used in processes while $\mathcal{AX}$ are variables used to store messages.
An attacker term is a term from $\mathcal{T}(\mathcal{F}_c \cup \mathcal{F}_d \cup \mathcal{C}, \mathcal{AX}, \mathcal{FN} \cup \mathcal{FK})$. In particular,
an attacker term cannot use nonces and keys created by the protocol's parties.

A substitution $\sigma = \{M_1/x_1, \dots, M_k/x_k\}$ is a mapping from variables
$x_1, \dots, x_k \in \mathcal{V}$ to messages $M_1, \dots, M_k$. We let $dom(\sigma) = \{x_1, \dots, x_k\}$. We
say that $\sigma$ is ground if all messages $M_1, \dots, M_k$ are ground. We let $names(\sigma) =
\bigcup_{1 \le i \le k} names(M_i)$. The application of a substitution $\sigma$ to a term $t$ is denoted
$t\sigma$ and is defined as usual.
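The evaluation of a term $t$, denoted $t{\downarrow}$, corresponds to the bottom-up application
of the cryptographic primitives and is recursively defined as follows.

\[
\begin{array}{lll}
u{\downarrow} &= u & \text{if } u \in \mathcal{N} \cup \mathcal{V} \cup \mathcal{K} \cup \mathcal{C} \\
\mathsf{pk}(t){\downarrow} &= \mathsf{pk}(t{\downarrow}) & \text{if } t{\downarrow} \in \mathcal{K} \\
\mathsf{vk}(t){\downarrow} &= \mathsf{vk}(t{\downarrow}) & \text{if } t{\downarrow} \in \mathcal{K} \\
\mathsf{h}(t){\downarrow} &= \mathsf{h}(t{\downarrow}) & \text{if } t{\downarrow} \neq \bot \\
\langle t_1, t_2\rangle{\downarrow} &= \langle t_1{\downarrow}, t_2{\downarrow}\rangle & \text{if } t_1{\downarrow} \neq \bot \text{ and } t_2{\downarrow} \neq \bot \\
\mathsf{enc}(t_1, t_2){\downarrow} &= \mathsf{enc}(t_1{\downarrow}, t_2{\downarrow}) & \text{if } t_1{\downarrow} \neq \bot \text{ and } t_2{\downarrow} \in \mathcal{K} \\
\mathsf{sign}(t_1, t_2){\downarrow} &= \mathsf{sign}(t_1{\downarrow}, t_2{\downarrow}) & \text{if } t_1{\downarrow} \neq \bot \text{ and } t_2{\downarrow} \in \mathcal{K} \\
\mathsf{aenc}(t_1, t_2){\downarrow} &= \mathsf{aenc}(t_1{\downarrow}, t_2{\downarrow}) & \text{if } t_1{\downarrow} \neq \bot \text{ and } t_2{\downarrow} = \mathsf{pk}(k) \text{ for some } k \in \mathcal{K} \\
\pi_1(t){\downarrow} &= t_1 & \text{if } t{\downarrow} = \langle t_1, t_2\rangle \\
\pi_2(t){\downarrow} &= t_2 & \text{if } t{\downarrow} = \langle t_1, t_2\rangle \\
\mathsf{dec}(t_1, t_2){\downarrow} &= t_3 & \text{if } t_1{\downarrow} = \mathsf{enc}(t_3, t_4) \text{ and } t_4 = t_2{\downarrow} \\
\mathsf{adec}(t_1, t_2){\downarrow} &= t_3 & \text{if } t_1{\downarrow} = \mathsf{aenc}(t_3, \mathsf{pk}(t_4)) \text{ and } t_4 = t_2{\downarrow} \\
\mathsf{checksign}(t_1, t_2){\downarrow} &= t_3 & \text{if } t_1{\downarrow} = \mathsf{sign}(t_3, t_4) \text{ and } t_2{\downarrow} = \mathsf{vk}(t_4) \\
t{\downarrow} &= \bot & \text{otherwise}
\end{array}
\]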

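Note that the evaluation of a term $t$ succeeds only if the underlying keys are atomic
and always returns a message or $\bot$. For example we have $\pi_1(\langle a, b\rangle){\downarrow} = a$, while
$\mathsf{dec}(\mathsf{enc}(a, \langle b, b\rangle), \langle b, b\rangle){\downarrow} = \bot$, because the key is non-atomic. We write $t =_{\downarrow} t'$
if $t{\downarrow} = t'{\downarrow}$.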
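
The evaluation rules can be read as a small recursive function. The sketch below is only an illustration under assumed conventions: atoms (names, variables, keys, constants) are Python strings, a function application is a tuple whose first component is the symbol name, pairs use the hypothetical tag `"pair"`, the projections are called `"proj1"` and `"proj2"`, and $\bot$ is represented by `None`.

```python
BOTTOM = None  # stands for the failure symbol of the evaluation


def _is(value, head):
    """True if `value` is an application of the function symbol `head`."""
    return isinstance(value, tuple) and value[0] == head


def evaluate(t, keys):
    """Evaluate term `t` bottom-up; `keys` is the set of atomic key names."""
    if isinstance(t, str):                 # nonce, variable, key or constant
        return t
    head, *args = t
    vals = [evaluate(a, keys) for a in args]
    if any(v is BOTTOM for v in vals):     # a failing argument makes the term fail
        return BOTTOM
    if head in ("pk", "vk"):               # only atomic keys are accepted
        return (head, vals[0]) if vals[0] in keys else BOTTOM
    if head == "h":
        return ("h", vals[0])
    if head == "pair":
        return ("pair", vals[0], vals[1])
    if head in ("enc", "sign"):
        return (head, vals[0], vals[1]) if vals[1] in keys else BOTTOM
    if head == "aenc":                     # the key must evaluate to pk(k)
        return ("aenc", vals[0], vals[1]) if _is(vals[1], "pk") else BOTTOM
    if head in ("proj1", "proj2"):
        if _is(vals[0], "pair"):
            return vals[0][1] if head == "proj1" else vals[0][2]
        return BOTTOM
    if head == "dec":
        c, k = vals
        return c[1] if _is(c, "enc") and c[2] == k else BOTTOM
    if head == "adec":
        c, k = vals
        return c[1] if _is(c, "aenc") and c[2] == ("pk", k) else BOTTOM
    if head == "checksign":
        s, v = vals
        return s[1] if _is(s, "sign") and v == ("vk", s[2]) else BOTTOM
    return BOTTOM                          # "otherwise" case
```

With `keys = {"b"}`, the term `("dec", ("enc", "a", ("pair", "b", "b")), ("pair", "b", "b"))` evaluates to `BOTTOM` because the inner encryption already fails on its non-atomic key, matching the example in the text.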


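Destructors used in processes:
\[
d ::= \mathsf{dec}(x, t) \mid \mathsf{adec}(x, t) \mid \mathsf{checksign}(x, t') \mid \pi_1(x) \mid \pi_2(x)
\]
where $x \in \mathcal{X}$, $t \in \mathcal{K} \cup \mathcal{X}$, $t' \in \{\mathsf{vk}(k) \mid k \in \mathcal{K}\} \cup \mathcal{X}$.

Processes:
\[
\begin{array}{rl}
P, Q ::= & 0 \mid \mathsf{new}\ n.P \mid \mathsf{out}(M).P \mid \mathsf{in}(x).P \mid (P \mid Q) \mid\ !P \\
\mid & \mathsf{let}\ x = d\ \mathsf{in}\ P\ \mathsf{else}\ Q \mid \mathsf{if}\ M = N\ \mathsf{then}\ P\ \mathsf{else}\ Q
\end{array}
\]
where $n \in \mathcal{BN} \cup \mathcal{BK}$, $x \in \mathcal{X}$, and $M$, $N$ are messages.


Fig. 2. Syntax for processes.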


3.2 Processes 


Security protocols describe how messages should be exchanged between partic- 
ipants. We model them through a process algebra, whose syntax is displayed 
in Fig. 2. We identify processes up to $\alpha$-renaming, i.e., capture-avoiding substitution of
bound names and variables, which are defined as usual. Furthermore, we assume 
that all bound names, keys, and variables in the process are distinct. 
A configuration of the system is a tuple $(\mathcal{P}; \phi; \sigma)$ where:


- $\mathcal{P}$ is a multiset of processes that represents the current active processes;

- $\phi$ is a substitution with $dom(\phi) \subseteq \mathcal{AX}$ and for any $x \in dom(\phi)$, $\phi(x)$ (also
denoted $x\phi$) is a message that only contains variables in $dom(\sigma)$. $\phi$ represents
the terms that have been sent;

- $\sigma$ is a ground substitution.


The semantics of processes is given through a transition relation $\xrightarrow{\alpha}$, defined
in Fig. 3 ($\tau$ denotes a silent action). The relation $\xrightarrow{w}_*$ is defined as the reflexive
transitive closure of $\xrightarrow{\alpha}$, where $w$ is the concatenation of all actions. We also
write equality up to silent actions $=_\tau$.

Intuitively, process new n.P creates a fresh nonce or key, and behaves like 
P. Process out(M).P emits M and behaves like P, provided that the evaluation
of M is successful. The corresponding message is stored in the frame $\phi$,
corresponding to the attacker knowledge. A process may input any message
that an attacker can forge (rule IN) from her knowledge $\phi$, using a recipe R
to compute a new message from $\phi$. Note that all names are initially assumed
to be secret. Process P | Q corresponds to the parallel composition of P and 
Q. Process let x = d in P else Q behaves like P in which x is replaced 
by d if d can be successfully evaluated and behaves like Q otherwise. Process 
if M = N then P else Q behaves like P if M and N correspond to two equal 
messages and behaves like Q otherwise. The replicated process !P behaves as an 
unbounded number of copies of P. 

A trace of a process P is any possible sequence of transitions in the presence 
of an attacker that may read, forge, and send messages. Formally, the set of 
traces trace(P) is defined as follows. 


\[
trace(P) = \{(w, \phi, \sigma) \mid (\{P\}; \emptyset; \emptyset) \xrightarrow{w}_* (\mathcal{P}; \phi; \sigma)\}
\]


Example 1. Consider the private authentication protocol (PA) presented in
Sect. 2. The process $P_b(k_b, \mathsf{pk}(k_a))$ corresponding to responder B answering a
request from A has already been defined in Sect. 2.3. The process $P_a(k_a, \mathsf{pk}(k_b))$
corresponding to A willing to talk to B is:

\[
P_a(k_a, pk_b) = \mathsf{new}\ N_a.\ \mathsf{out}(\mathsf{aenc}(\langle N_a, \mathsf{pk}(k_a)\rangle, pk_b)).\ \mathsf{in}(z)
\]

Altogether, a session between A and B is represented by the process:

\[
P_a(k_a, \mathsf{pk}(k_b)) \mid P_b(k_b, \mathsf{pk}(k_a))
\]

where $k_a, k_b \in \mathcal{BK}$, which models that the attacker initially does not know $k_a$, $k_b$.
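\[
\begin{array}{lll}
(\{P_1 \mid P_2\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\tau} (\{P_1, P_2\} \cup \mathcal{P}; \phi; \sigma) & \text{PAR} \\[2pt]
(\{0\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\tau} (\mathcal{P}; \phi; \sigma) & \text{ZERO} \\[2pt]
(\{\mathsf{new}\ n.P\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\tau} (\{P\} \cup \mathcal{P}; \phi; \sigma) & \text{NEW} \\[2pt]
(\{\mathsf{new}\ k.P\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\tau} (\{P\} \cup \mathcal{P}; \phi; \sigma) & \text{NEWKEY} \\[2pt]
(\{\mathsf{out}(t).P\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\mathsf{new}\ ax_n.\mathsf{out}(ax_n)} (\{P\} \cup \mathcal{P}; \phi \cup \{t/ax_n\}; \sigma) & \text{OUT} \\
\multicolumn{3}{l}{\quad \text{if } t\sigma \text{ is a ground term, } (t\sigma){\downarrow} \neq \bot,\ ax_n \in \mathcal{AX} \text{ and } n = |\phi| + 1} \\[2pt]
(\{\mathsf{in}(x).P\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\mathsf{in}(R)} (\{P\} \cup \mathcal{P}; \phi; \sigma \cup \{(R\phi\sigma){\downarrow}/x\}) & \text{IN} \\
\multicolumn{3}{l}{\quad \text{if } R \text{ is an attacker term such that } vars(R) \subseteq dom(\phi), \text{ and } (R\phi\sigma){\downarrow} \neq \bot} \\[2pt]
(\{\mathsf{let}\ x = d\ \mathsf{in}\ P\ \mathsf{else}\ Q\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\tau} (\{P\} \cup \mathcal{P}; \phi; \sigma \cup \{(d\sigma){\downarrow}/x\}) & \text{LET-IN} \\
\multicolumn{3}{l}{\quad \text{if } d\sigma \text{ is ground and } (d\sigma){\downarrow} \neq \bot} \\[2pt]
(\{\mathsf{let}\ x = d\ \mathsf{in}\ P\ \mathsf{else}\ Q\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\tau} (\{Q\} \cup \mathcal{P}; \phi; \sigma) & \text{LET-ELSE} \\
\multicolumn{3}{l}{\quad \text{if } d\sigma \text{ is ground and } (d\sigma){\downarrow} = \bot, \text{ i.e. } d \text{ fails}} \\[2pt]
(\{\mathsf{if}\ M = N\ \mathsf{then}\ P\ \mathsf{else}\ Q\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\tau} (\{P\} \cup \mathcal{P}; \phi; \sigma) & \text{IF-THEN} \\
\multicolumn{3}{l}{\quad \text{if } M, N \text{ are messages such that } M\sigma, N\sigma \text{ are ground, } (M\sigma){\downarrow} \neq \bot,\ (N\sigma){\downarrow} \neq \bot, \text{ and } M\sigma = N\sigma} \\[2pt]
(\{\mathsf{if}\ M = N\ \mathsf{then}\ P\ \mathsf{else}\ Q\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\tau} (\{Q\} \cup \mathcal{P}; \phi; \sigma) & \text{IF-ELSE} \\
\multicolumn{3}{l}{\quad \text{if } M, N \text{ are messages such that } M\sigma, N\sigma \text{ are ground and } (M\sigma){\downarrow} = \bot \text{ or } (N\sigma){\downarrow} = \bot \text{ or } M\sigma \neq N\sigma} \\[2pt]
(\{!P\} \cup \mathcal{P}; \phi; \sigma) & \xrightarrow{\tau} (\{P, !P\} \cup \mathcal{P}; \phi; \sigma) & \text{REPL}
\end{array}
\]


Fig. 3. Semantics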


An example of a trace describing an "honest" execution, where the attacker
does not interfere with the intended run of the protocol, can be written as $(tr, \phi)$
where

\[
tr =_\tau \mathsf{new}\ x_1.\mathsf{out}(x_1).\mathsf{in}(x_1).\mathsf{new}\ x_2.\mathsf{out}(x_2).\mathsf{in}(x_2)
\]

and

\[
\phi = \{x_1 \mapsto \mathsf{aenc}(\langle N_a, \mathsf{pk}(k_a)\rangle, \mathsf{pk}(k_b)),\ x_2 \mapsto \mathsf{aenc}(\langle N_a, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle, \mathsf{pk}(k_a))\}.
\]

The trace $tr$ describes A outputting the first message of the protocol, which is
stored in $\phi(x_1)$. The attacker then simply forwards $\phi(x_1)$ to B. B then performs
several silent actions (decrypting the message, comparing its content to $\mathsf{pk}(k_a)$),
and outputs a response, which is stored in $\phi(x_2)$ and forwarded to A by the
attacker.


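\[
\begin{array}{rl}
l ::= & \mathsf{LL} \mid \mathsf{HL} \mid \mathsf{HH} \\
KT ::= & \mathsf{key}^{l}(T) \mid \mathsf{eqkey}^{l}(T) \mid \mathsf{seskey}^{l,a}(T) \quad \text{with } a \in \{1, \infty\} \\
T ::= & l \mid T * T \mid T \vee T \mid [\![\tau^{l,a}_n ; \tau^{l',a}_m]\!] \quad \text{with } a \in \{1, \infty\} \\
\mid & KT \mid \mathsf{pkey}(KT) \mid \mathsf{vkey}(KT) \mid (T)_T \mid \{T\}_T
\end{array}
\]


Fig. 4. Types for terms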
3.3 Equivalence


When processes evolve, sent messages are stored in a substitution $\phi$ while the
values of variables are stored in $\sigma$. A frame is simply a substitution $\psi$ where
$dom(\psi) \subseteq \mathcal{AX}$. It represents the knowledge of an attacker. In what follows, we
will typically consider $\phi\sigma$.

Intuitively, two sequences of messages are indistinguishable to an attacker
if he cannot perform any test that could distinguish them. This is typically
modelled as static equivalence [2]. Here, we consider a variant of [2] where the
attacker is also given the ability to observe when the evaluation of a term fails,
as defined for example in [25].


Definition 1 (Static Equivalence). Two ground frames $\phi$ and $\phi'$ are statically
equivalent if and only if they have the same domain, and for all attacker
terms $R, S$ with variables in $dom(\phi) = dom(\phi')$, we have

\[
(R\phi =_{\downarrow} S\phi) \iff (R\phi' =_{\downarrow} S\phi')
\]
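
To make the definition concrete, the following sketch tests the equivalence only for a finite family of recipes, so it can refute static equivalence but never prove it. It assumes a deliberately reduced setting (atoms are strings, the only function symbols are `enc` and `dec`, recipes are frame variables, public constants, or one `dec` applied to two of those); all names are hypothetical.

```python
from itertools import product

BOTTOM = None  # failure of evaluation


def evaluate(t, keys):
    """Evaluate a term built from atoms, enc(m, k) and dec(c, k)."""
    if isinstance(t, str):
        return t
    head, a, b = t
    a, b = evaluate(a, keys), evaluate(b, keys)
    if a is BOTTOM or b is BOTTOM:
        return BOTTOM
    if head == "enc":
        return ("enc", a, b) if b in keys else BOTTOM
    if head == "dec":
        ok = isinstance(a, tuple) and a[0] == "enc" and a[2] == b
        return a[1] if ok else BOTTOM
    return BOTTOM


def substitute(recipe, frame):
    """Replace frame variables in a recipe by the messages they store."""
    if isinstance(recipe, str):
        return frame.get(recipe, recipe)
    head, *args = recipe
    return (head, *[substitute(a, frame) for a in args])


def recipes(frame, constants):
    """Frame variables, public constants, and one level of decryption."""
    atoms = list(frame) + list(constants)
    yield from atoms
    for a, b in product(atoms, repeat=2):
        yield ("dec", a, b)


def bounded_static_equivalence(f1, f2, constants, keys):
    """Check (R f1 = S f1) <=> (R f2 = S f2) for the bounded recipe family only."""
    if f1.keys() != f2.keys():
        return False
    rs = list(recipes(f1, constants))
    for r, s in product(rs, repeat=2):
        same1 = evaluate(substitute(r, f1), keys) == evaluate(substitute(s, f1), keys)
        same2 = evaluate(substitute(r, f2), keys) == evaluate(substitute(s, f2), keys)
        if same1 != same2:
            return False
    return True
```

For instance, with a secret key `k`, the frames `{"ax1": ("enc", "a", "k")}` and `{"ax1": ("enc", "b", "k")}` pass the bounded check, whereas adding `"ax0": "k"` to both frames lets the recipe `dec(ax1, ax0)` distinguish them.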


Then two processes P and Q are in equivalence if no matter how the adversary 
interacts with P, a similar interaction may happen with Q, with equivalent 
resulting frames. 


Definition 2 (Trace Equivalence). Let $P$, $Q$ be two processes. We write $P \sqsubseteq_t Q$
if for all $(s, \phi, \sigma) \in trace(P)$, there exists $(s', \phi', \sigma') \in trace(Q)$ such that
$s =_\tau s'$ and $\phi\sigma$ and $\phi'\sigma'$ are statically equivalent. We say that $P$ and $Q$ are
trace equivalent, and we write $P \approx_t Q$, if $P \sqsubseteq_t Q$ and $Q \sqsubseteq_t P$.


Note that this definition already includes the attacker’s behaviour, since pro- 
cesses may input any message forged by the attacker. 


Example 2. As explained in Sect.2, anonymity is modelled as an equivalence 
property. Intuitively, an attacker should not be able to know which agents are 
executing the protocol. In the case of protocol PA, presented in Example 1, the 
anonymity property can be modelled by the following equivalence: 


\[
P_a(k_a, \mathsf{pk}(k_b)) \mid P_b(k_b, \mathsf{pk}(k_a)) \approx_t P_a(k_a, \mathsf{pk}(k_b)) \mid P_b(k_b, \mathsf{pk}(k_c))
\]


4 A Type System for Dynamic Keys 


Types. In our type system we give types to pairs of messages — one from the 
left process and one from the right one. We store the types of nonces, variables, 
and keys in a typing environment $\Gamma$. While we store a type for a single nonce
or variable occurring in both processes, we assign a potentially different type to 
every different combination of keys (k, k’) used in the left and right process — so 
called bikeys. This is an important non-standard feature that enables us to type 
protocols using different encryption and decryption keys. 

The types for messages are defined in Fig. 4 and explained below. Selected 
subtyping rules are given in Fig. 5. We assume three security labels HH, HL and LL,
ranged over by $l$, whose first (resp. second) component denotes the confidentiality
(resp. integrity) level. Intuitively, values of high confidentiality may never be
output to the network in plain, and values of high integrity are guaranteed
not to originate from the attacker. Pair types $T * T'$ describe the type of their
components and the type $T \vee T'$ is given to messages that can have type $T$ or
type $T'$.

\[
\frac{}{\mathsf{eqkey}^{l}(T) <: \mathsf{key}^{l}(T)}\ (\text{SEQKEY})
\qquad
\frac{}{\mathsf{seskey}^{l,a}(T) <: \mathsf{eqkey}^{l}(T)}\ (\text{SSESKEY})
\qquad
\frac{}{\mathsf{key}^{l}(T) <: l}\ (\text{SKEY})
\]
\[
\frac{T <: \mathsf{eqkey}^{l}(T')}{\mathsf{pkey}(T) <: \mathsf{LL}}\ (\text{SPUBKEY})
\qquad
\frac{T <: \mathsf{eqkey}^{l}(T')}{\mathsf{vkey}(T) <: \mathsf{LL}}\ (\text{SVKEY})
\]
\[
\frac{T <: T'}{(T)_{T''} <: (T')_{T''}}\ (\text{SENC})
\qquad
\frac{T <: T'}{\{T\}_{T''} <: \{T'\}_{T''}}\ (\text{SAENC})
\]


Fig. 5. Selected subtyping rules

The type $\tau^{l,a}_n$ describes nonces and constants of security level $l$: the label $a$
ranges over $\{\infty, 1\}$, denoting whether the nonce is bound within a replication or
not (constants are always typed with $a = 1$). We assume a different identifier $n$
for each constant and restriction in the process. The type $\tau^{l,1}_n$ is populated by a
single name (i.e., $n$ describes a constant or a non-replicated nonce) and $\tau^{l,\infty}_n$ is
a special type that is instantiated to $\tau^{l,1}_{n_j}$ in the $j$th replication of the process.

Type $[\![\tau^{l,a}_n ; \tau^{l',a}_m]\!]$ is a refinement type that restricts the set of possible values of
a message to values of type $\tau^{l,a}_n$ on the left and type $\tau^{l',a}_m$ on the right. For a
refinement type $[\![\tau^{l,a}_n ; \tau^{l,a}_n]\!]$ with equal types on both sides we write $\tau^{l,a}_n$.

Keys can have three different types ranged over by $KT$, ordered by a subtyping
relation (SEQKEY, SSESKEY): $\mathsf{seskey}^{l,a}(T) <: \mathsf{eqkey}^{l}(T) <: \mathsf{key}^{l}(T)$. For all
three types, $l$ denotes the security label (SKEY) of the key and $T$ is the type of
the payload that can be encrypted or signed with these keys. This allows us to
transfer typing information from one process to another one: e.g. when encrypting,
we check that the payload type is respected, so that we can be sure to get a value
of the payload type upon decryption. The three different types encode different
relations between the left and the right component of a bikey $(k, k')$. While type
$\mathsf{key}^{l}(T)$ can be given to bikeys with different components $k \neq k'$, type $\mathsf{eqkey}^{l}(T)$
ensures that the keys are equal on both sides in the specific typed instruction.
Type $\mathsf{seskey}^{l,a}(T)$ additionally guarantees that the key is always the same on the
left and the right throughout the whole process. We allow for dynamic generation
of keys of type $\mathsf{seskey}^{l,a}(T)$ and use a label $a$ to denote whether the key is generated
under replication or not, just like for nonce types.
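
The ordering on key types can be sketched as a small subtype check. This is an illustration restricted to the rules SSESKEY, SEQKEY and SKEY plus reflexivity and transitivity; the `KeyType` record and `is_subtype` function are hypothetical names, payload types are kept as opaque strings, and no other subtyping rule is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyType:
    kind: str       # "key", "eqkey" or "seskey"
    label: str      # "LL", "HL" or "HH"
    payload: str    # payload type, kept opaque in this sketch
    repl: str = ""  # "1" or "inf" for seskey, unused otherwise


ORDER = ["seskey", "eqkey", "key"]  # from most to least specific


def is_subtype(t1, t2):
    """Subtyping limited to SSESKEY, SEQKEY, SKEY, reflexivity and transitivity."""
    if t1 == t2:
        return True
    if isinstance(t1, KeyType) and isinstance(t2, str):
        # any key type reaches key^l(T), and SKEY gives key^l(T) <: l
        return t1.label == t2
    if isinstance(t1, KeyType) and isinstance(t2, KeyType):
        same = (t1.label, t1.payload) == (t2.label, t2.payload)
        return same and ORDER.index(t1.kind) < ORDER.index(t2.kind)
    return False
```

For example, `is_subtype(KeyType("seskey", "HH", "T", "1"), KeyType("key", "HH", "T"))` holds by chaining SSESKEY and SEQKEY, while the converse direction does not.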

For a key of type $T$, we use types $\mathsf{pkey}(T)$ and $\mathsf{vkey}(T)$ for the corresponding
public key and verification key, and types $(T')_T$ and $\{T'\}_T$ for symmetric
and asymmetric encryptions of messages of type $T'$ with this key. Public keys
and verification keys can be treated as LL if the corresponding keys are equal
(SPUBKEY, SVKEY) and subtyping on encryptions is directly induced by subtyping
of the payload types (SENC, SAENC) (Fig. 6).
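\[
\frac{\Gamma(n) = \tau^{l,a}_n \quad \Gamma(m) = \tau^{l,a}_m \quad l \in \{\mathsf{HH}, \mathsf{HL}\}}{\Gamma \vdash n \sim m : l \to \emptyset}\ (\text{TNONCE})
\qquad
\frac{\Gamma(n) = \tau^{\mathsf{LL},a}_n}{\Gamma \vdash n \sim n : \mathsf{LL} \to \emptyset}\ (\text{TNONCEL})
\]
\[
\frac{\Gamma(x) = T}{\Gamma \vdash x \sim x : T \to \emptyset}\ (\text{TVAR})
\qquad
\frac{\Gamma \vdash M \sim N : T' \to c \quad T' <: T}{\Gamma \vdash M \sim N : T \to c}\ (\text{TSUB})
\]
\[
\frac{\Gamma \vdash M \sim N : T \to c \quad \Gamma \vdash M' \sim N' : T' \to c'}{\Gamma \vdash \langle M, M'\rangle \sim \langle N, N'\rangle : T * T' \to c \cup c'}\ (\text{TPAIR})
\qquad
\frac{M, N \text{ well formed}}{\Gamma \vdash M \sim N : \mathsf{HL} \to \emptyset}\ (\text{THIGH})
\]
\[
\frac{\Gamma(k, k') = T}{\Gamma \vdash k \sim k' : T \to \emptyset}\ (\text{TKEY})
\qquad
\frac{k \in keys(\Gamma) \cup \mathcal{FK}}{\Gamma \vdash \mathsf{pk}(k) \sim \mathsf{pk}(k) : \mathsf{LL} \to \emptyset}\ (\text{TPUBKEYL})
\]
\[
\frac{\Gamma \vdash M \sim N : T \to \emptyset \quad \exists T', l.\ T <: \mathsf{key}^{l}(T')}{\Gamma \vdash \mathsf{pk}(M) \sim \mathsf{pk}(N) : \mathsf{pkey}(T) \to \emptyset}\ (\text{TPUBKEY})
\]
\[
\frac{\Gamma \vdash M \sim N : T \to c \quad \Gamma \vdash M' \sim N' : T' \to c' \quad T' = \mathsf{LL} \vee (\exists T'', T''', l.\ T' = \mathsf{pkey}(T'') \wedge T'' <: \mathsf{key}^{l}(T'''))}{\Gamma \vdash \mathsf{aenc}(M, M') \sim \mathsf{aenc}(N, N') : \{T\}_{T'} \to c \cup c'}\ (\text{TAENC})
\]
\[
\frac{\Gamma \vdash M \sim N : \{T\}_{\mathsf{pkey}(T')} \to c \quad T' <: \mathsf{key}^{\mathsf{HH}}(T)}{\Gamma \vdash M \sim N : \mathsf{LL} \to c \cup \{M \sim N\}}\ (\text{TAENCH})
\]
\[
\frac{\Gamma \vdash M \sim N : \{\mathsf{LL}\}_{T} \to c \quad (T = \mathsf{pkey}(T') \wedge T' <: \mathsf{eqkey}^{l}(T'')) \text{ or } T = \mathsf{LL}}{\Gamma \vdash M \sim N : \mathsf{LL} \to c}\ (\text{TAENCL})
\]


Fig. 6. Selected rules for messages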


Constraints. When typing messages, we generate constraints of the form 
(M ~ N), meaning that the attacker may see M and N in the left and right 
process, respectively, and these two messages are thus required to be indistin- 
guishable. 

Due to space reasons we only present a few selected rules that are character- 
istic of the typing of branching protocols. The omitted rules are similar in spirit 
to the presented ones or are standard rules for equivalence typing [28]. 


4.1 Typing Messages 


The typing judgement for messages is of the form $\Gamma \vdash M \sim N : T \to c$ which
reads as follows: under the environment $\Gamma$, M and N are of type T and either this
is a high confidentiality type (i.e., M and N are not disclosed to the attacker) or
M and N are indistinguishable for the attacker assuming the set of constraints
c is consistent.

Confidential nonces can be given their label from the typing environment 
in rule TNONCE. Since their label prevents them from being released in clear, 
the attacker cannot observe them and we do not need to add constraints for 
them. They can however be output in encrypted form and will then appear in
the constraints of the encryption. Public nonces (labeled as LL) can be typed if
they are equal on both sides (rule TNONCEL). These are standard rules, as well
as the rules TVAR, TSUB, TPAIR and THIGH [28].

A non-standard rule that is crucial for the typing of branching protocols is 
rule TKEY. As the typing environment contains types for bikeys (k, k’) this rule 
allows us to type two potentially different keys with their type from the environment.
With the standard rule TPUBKEYL we can only type a public key of the
same keys on both sides, while rule TPUBKEY allows us to type different public
keys $\mathsf{pk}(M)$, $\mathsf{pk}(N)$, provided we can show that there exists a valid key type for
the terms M and N. This highlights another important technical contribution 
of this work, as compared to existing type systems for equivalence: we do not 
only support a fixed set of keys, but also allow for the usage of keys in variables, 
that have been received from the network. 

To show that a message is of type $\{T\}_{T'}$, i.e. a message of type $T$ encrypted
asymmetrically with a key of type $T'$, we have to show that the corresponding
terms have exactly these types in rule TAENC. The generated constraints are
simply propagated. In addition we need to show that $T'$ is a valid type for a
public key, or LL, which models untrusted keys received from the network. Note
that this rule allows us to encrypt messages with different keys in the two processes.
For encryptions with honest keys (label HH) we can use rule TAENCH to
give type LL to the messages, if we can show that the payload type is respected.
In this case we add the entire encryptions to the constraints, since the attacker 
can check different encryptions for equality, even if he cannot obtain the plain- 
text. Rule TAENCL allows us to give type LL to encryptions even if we do not 
respect the payload type, or if the key is corrupted. However, we then have to 
type the plaintexts with type LL since we cannot guarantee their confidential- 
ity. Additionally, we have to ensure that the same key is used in both processes, 
because the attacker might possess the corresponding private keys and test which 
decryption succeeds. Since we already add constraints for giving type LL to the 
plaintext, we do not need to add any additional constraints. 


4.2 Typing Processes 


From now on, we assume that processes assign a type to freshly generated nonces 
and keys. That is, new n.P is now of the form new n : T. P. This requires a (very
light) type annotation from the user. The typing judgement for processes is of
the form $\Gamma \vdash P \sim Q \to C$ and can be interpreted as follows: If two processes
P and Q can be typed in $\Gamma$ and if the generated constraint set C is consistent,
then P and Q are trace equivalent. We present selected rules in Fig. 7.

Rule POUT states that we can output messages to the network if we can 
type them with type LL, i.e., they are indistinguishable to the attacker, pro- 
vided that the generated set c of constraints is consistent. The constraints 
of c are then added to all constraints in the constraint set C. We define
$C \cup_\forall c' := \{(c \cup c', \Gamma) \mid (c, \Gamma) \in C\}$. This rule, as well as the rules PZERO, PIN,
PNEW, PPAR, and PLET, are standard rules [28].
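
The operator used in rule POUT can be sketched as follows, assuming a hypothetical encoding where a constraint set is a Python set of pairs `(c, gamma)`, each `c` is a `frozenset` of constraints `(M, N)`, and environments are any hashable value (here simply a name).

```python
def union_forall(constraint_set, extra):
    """Add the constraints `extra` to every element (c, gamma) of the constraint set."""
    extra = frozenset(extra)
    return {(c | extra, gamma) for (c, gamma) in constraint_set}
```

For example, adding the single constraint `("M", "N")` to a set with two elements extends the constraints of both elements and leaves their environments untouched.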
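\[
\frac{\Gamma \vdash P \sim Q \to C \quad \Gamma \vdash M \sim N : \mathsf{LL} \to c}{\Gamma \vdash \mathsf{out}(M).P \sim \mathsf{out}(N).Q \to C \cup_\forall c}\ (\text{POUT})
\]
\[
\frac{\Gamma \text{ does not contain union types}}{\Gamma \vdash 0 \sim 0 \to (\emptyset, \Gamma)}\ (\text{PZERO})
\qquad
\frac{\Gamma, x : \mathsf{LL} \vdash P \sim Q \to C}{\Gamma \vdash \mathsf{in}(x).P \sim \mathsf{in}(x).Q \to C}\ (\text{PIN})
\]
\[
\frac{\Gamma, n : \tau^{l,a}_n \vdash P \sim Q \to C}{\Gamma \vdash \mathsf{new}\ n : \tau^{l,a}_n.P \sim \mathsf{new}\ n : \tau^{l,a}_n.Q \to C}\ (\text{PNEW})
\]
\[
\frac{\Gamma, (k, k) : \mathsf{seskey}^{l,a}(T) \vdash P \sim Q \to C}{\Gamma \vdash \mathsf{new}\ k : \mathsf{seskey}^{l,a}(T).P \sim \mathsf{new}\ k : \mathsf{seskey}^{l,a}(T).Q \to C}\ (\text{PNEWKEY})
\]
\[
\frac{\Gamma \vdash P \sim Q \to C \quad \Gamma \vdash P' \sim Q' \to C'}{\Gamma \vdash P \mid P' \sim Q \mid Q' \to C \cup_\times C'}\ (\text{PPAR})
\]
\[
\frac{\Gamma \vdash_d t \sim t' : T \quad \Gamma, x : T \vdash P \sim Q \to C \quad \Gamma \vdash P' \sim Q' \to C'}{\Gamma \vdash \mathsf{let}\ x = t\ \mathsf{in}\ P\ \mathsf{else}\ P' \sim \mathsf{let}\ x = t'\ \mathsf{in}\ Q\ \mathsf{else}\ Q' \to C \cup C'}\ (\text{PLET})
\]
\[
\frac{\begin{array}{c}
\Gamma(y) = \mathsf{LL} \quad \Gamma(k, k) <: \mathsf{key}^{\mathsf{HH}}(T) \\
\Gamma, x : T \vdash P \sim Q \to C \quad \Gamma, x : \mathsf{LL} \vdash P \sim Q \to C' \quad \Gamma \vdash P' \sim Q' \to C'' \\
(\forall T'.\ \forall k' \neq k.\ \Gamma(k, k') <: \mathsf{key}^{\mathsf{HH}}(T') \Rightarrow \Gamma, x : T' \vdash P \sim Q' \to C_{k'}) \\
(\forall T'.\ \forall k' \neq k.\ \Gamma(k', k) <: \mathsf{key}^{\mathsf{HH}}(T') \Rightarrow \Gamma, x : T' \vdash P' \sim Q \to C'_{k'})
\end{array}}{\begin{array}{c}
\Gamma \vdash \mathsf{let}\ x = \mathsf{adec}(y, k)\ \mathsf{in}\ P\ \mathsf{else}\ P' \sim \mathsf{let}\ x = \mathsf{adec}(y, k)\ \mathsf{in}\ Q\ \mathsf{else}\ Q' \\
\to C \cup C' \cup C'' \cup (\bigcup_{k'} C_{k'}) \cup (\bigcup_{k'} C'_{k'})
\end{array}}\ (\text{PLETADECSAME})
\]
\[
\frac{\Gamma \vdash P \sim Q \to C_1 \quad \Gamma \vdash P' \sim Q' \to C_2 \quad \Gamma \vdash P \sim Q' \to C_3 \quad \Gamma \vdash P' \sim Q \to C_4}{\Gamma \vdash \mathsf{if}\ M = M'\ \mathsf{then}\ P\ \mathsf{else}\ P' \sim \mathsf{if}\ N = N'\ \mathsf{then}\ Q\ \mathsf{else}\ Q' \to C_1 \cup C_2 \cup C_3 \cup C_4}\ (\text{PIFALL})
\]


Fig. 7. Selected rules for processes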


Rule PNEWKEY allows us to generate new session keys at runtime, which 
models security protocols more faithfully. It also allows us to generate infinitely 
many keys, by introducing new keys under replication. 

Rule PLETADECSAME treats asymmetric decryptions where we use the same 
fixed honest key (label HH) for decryptions in both processes. Standard type sys- 
tems for equivalence have a simplifying (and restrictive) invariant that guaran- 
tees that encryptions are always performed using the same keys in both pro- 
cesses and hence guarantee that both processes always take the same branch in 
decryption (compare rule PLET). In our system however, we allow encryptions 
with potentially different keys, which requires cross-case validation in order to 
retain soundness. Still, the number of possible combinations of encryption keys 
is limited by the assignments in the typing environment $\Gamma$. To cover all the
possibilities, we type the following combinations of continuation processes: 
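\[
\frac{\Gamma(k, k) <: \mathsf{key}^{\mathsf{LL}}(T) \quad \Gamma(x) = \mathsf{LL}}{\Gamma \vdash_d \mathsf{adec}(x, k) \sim \mathsf{adec}(x, k) : \mathsf{LL}}\ (\text{DADECL})
\]
\[
\frac{\Gamma(y) = \mathsf{seskey}^{\mathsf{HH},a}(T) \quad \Gamma(x) = \mathsf{LL}}{\Gamma \vdash_d \mathsf{adec}(x, y) \sim \mathsf{adec}(x, y) : T \vee \mathsf{LL}}
\]
\[
\frac{(\Gamma(y) = \mathsf{seskey}^{\mathsf{LL},a}(T) \vee \Gamma(y) = \mathsf{LL}) \quad \Gamma(x) = \mathsf{LL}}{\Gamma \vdash_d \mathsf{adec}(x, y) \sim \mathsf{adec}(x, y) : \mathsf{LL}}\ (\text{DADECL'})
\]
\[
\frac{\Gamma(k, k) = \mathsf{seskey}^{l,a}(T') \quad \Gamma(x) = \{T\}_{\mathsf{pkey}(\mathsf{seskey}^{l,a}(T'))}}{\Gamma \vdash_d \mathsf{adec}(x, k) \sim \mathsf{adec}(x, k) : T}\ (\text{DADECT})
\]
\[
\frac{\Gamma(y) = \mathsf{seskey}^{l,a}(T') \quad \Gamma(x) = \{T\}_{\mathsf{pkey}(\mathsf{seskey}^{l,a}(T'))}}{\Gamma \vdash_d \mathsf{adec}(x, y) \sim \mathsf{adec}(x, y) : T}\ (\text{DADECT'})
\]


Fig. 8. Selected destructor rules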


- Both then branches: In this case we know that key $k$ was used for encryption
on both sides. Because of $\Gamma(k, k) <: \mathsf{key}^{\mathsf{HH}}(T)$, we know that in this case the
payload type is $T$ and we type the continuation with $\Gamma, x : T$.

Because the message may also originate from the attacker (who also has access
to the public key), we have to type the two then branches also with $\Gamma, x : \mathsf{LL}$.

- Both else branches: If decryption fails on both sides, we type the two else
branches without introducing any new variables.

- Left then, right else: The encryption may have been created with key $k$ on
the left side and another key $k'$ on the right side. Hence, for each $k' \neq k$ such
that $\Gamma(k, k')$ maps to a key type with label HH and payload type $T'$, we have
to typecheck the left then branch and the right else branch with $\Gamma, x : T'$.

- Left else, right then: This case is analogous to the previous one.


The generated set of constraints is simply the union of all generated constraints 
for the subprocesses. Rule PIFALL lets us typecheck any conditional by simply 
checking the four possible branch combinations. In contrast to the other rules 
for conditionals that we present in a companion technical report, this rule does 
not require any other preconditions or checks on the terms M, M’, N, N’. 


Destructor Rules. The rule PLET requires that a destructor application succeeds 
or fails equally in the two processes. To ensure this property, it relies on addi- 
tional rules for destructors. We present selected rules in Fig. 8. Rule DADECL 
is a standard rule that states that a decryption of a variable of type LL with an 
untrusted key (label LL) yields a result of type LL. Decryption with a trusted 
(label HH) session key gives us a value of the key’s payload type or type LL in 
case the encryption was created by the attacker using the public key. Here it 
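is important that the key is of type $\mathsf{seskey}^{\mathsf{HH},a}(T)$, since this guarantees that
the key is never used in combination with a different key and hence decryption
will always equally succeed or fail in both processes. Rule DADECL' is similar to
rule DADECL except it uses a variable for decryption instead of a fixed key. Rule
DADECT treats the case in which we know that the variable $x$ is an asymmetric
encryption of a specific type. If the type of the key used for decryption matches
the key type used for encryption, we know the exact type of the result of a successful
decryption. DADECT' is similar to DADECT, with a variable as key. In
a companion technical report we present similar rules for symmetric decryption
and verification of signatures.

\[
(\star)\quad \frac{\langle y_1, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle,\ N_b \text{ well formed}}{\Gamma \vdash \langle y_1, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle \sim N_b : \mathsf{HL} \to \emptyset}\ (\text{THIGH})
\]
\[
(\star\star)\quad \frac{\dfrac{\Gamma(k_a, k) = \mathsf{key}^{\mathsf{HH}}(\mathsf{HL})}{\Gamma \vdash k_a \sim k : \mathsf{key}^{\mathsf{HH}}(\mathsf{HL}) \to \emptyset}\ (\text{TKEY})}{\Gamma \vdash \mathsf{pk}(k_a) \sim \mathsf{pk}(k) : \mathsf{pkey}(\mathsf{key}^{\mathsf{HH}}(\mathsf{HL})) \to \emptyset}\ (\text{TPUBKEY})
\]
\[
\frac{\dfrac{(\star) \qquad (\star\star)}{\Gamma \vdash \mathsf{aenc}(\langle y_1, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle, \mathsf{pk}(k_a)) \sim \mathsf{aenc}(N_b, \mathsf{pk}(k)) : \{\mathsf{HL}\}_{\mathsf{pkey}(\mathsf{key}^{\mathsf{HH}}(\mathsf{HL}))} \to \emptyset}\ (\text{TAENC})}{\Gamma \vdash \mathsf{aenc}(\langle y_1, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle, \mathsf{pk}(k_a)) \sim \mathsf{aenc}(N_b, \mathsf{pk}(k)) : \mathsf{LL} \to C}\ (\text{TAENCH})
\]
where $C = \{\mathsf{aenc}(\langle y_1, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle, \mathsf{pk}(k_a)) \sim \mathsf{aenc}(N_b, \mathsf{pk}(k))\}$.


Fig. 9. Type derivation for the response to A and the decoy message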


4.3 Typing the Private Authentication Protocol 


We now show how our type system can be applied to type the Private Authen- 
tication protocol presented in Sect. 2.3, by showing the most interesting parts of 
the derivation. We type the protocol using the initial environment $\Gamma$ presented
in Fig. 1.

We focus on the responder process $P_b$ and start with the asymmetric decryption.
As we use the same key $k_b$ in both processes, we apply rule PLETADECSAME.
We have $\Gamma(x) = \mathsf{LL}$ by rule PIN and $\Gamma(k_b, k_b) = \mathsf{key}^{\mathsf{HH}}(\mathsf{HH} * \mathsf{LL})$. We do
not have any other entry using key $k_b$ in $\Gamma$. We hence typecheck the two then
branches once with $\Gamma, y : (\mathsf{HH} * \mathsf{LL})$ and once with $\Gamma, y : \mathsf{LL}$, as well as the two
else branches (which are just 0 in this case).
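
The list of continuation pairs that rule PLETADECSAME asks us to typecheck can be enumerated from the bikey entries of the environment. The sketch below assumes a hypothetical encoding of those entries as a dictionary from bikeys `(left, right)` to `(label, payload_type)`; it only lists the obligations and does not typecheck anything.

```python
def adec_same_obligations(gamma_keys, k):
    """List the branch pairs to typecheck for `let x = adec(y, k)` on both sides.

    Each obligation is (left_branch, right_branch, type_given_to_x);
    the type is None when no variable is bound (both else branches).
    """
    label, payload = gamma_keys[(k, k)]
    if label != "HH":
        raise ValueError("the rule is stated for honest keys (label HH) only")
    obligations = [
        ("then", "then", payload),  # honest encryption under k on both sides
        ("then", "then", "LL"),     # encryption forged by the attacker
        ("else", "else", None),     # decryption fails on both sides
    ]
    for (left, right), (lab, pay) in gamma_keys.items():
        if lab != "HH" or left == right:
            continue
        if left == k:               # k on the left, another key on the right
            obligations.append(("then", "else", pay))
        if right == k:              # another key on the left, k on the right
            obligations.append(("else", "then", pay))
    return obligations
```

With the single entry `{("kb", "kb"): ("HH", "HH * LL")}` this yields exactly the three obligations described for the responder. Adding an entry such as `("ka", "k"): ("HH", "HL")` next to a hypothetical entry for `("ka", "ka")` produces the additional then/else obligation with payload HL when decrypting with `ka`.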

Typing the let expressions is straightforward using rule PLET. In the conditional
we check $y_2 = \mathsf{pk}(k_a)$ in the left process and $y_2 = \mathsf{pk}(k_c)$ in the right
process. Since we cannot guarantee which branches are taken or even if the same
branch is taken in the two processes, we use rule PIFALL to typecheck all four
possible combinations of branches. We now focus on the case where A is successfully
authenticated in the left process and is rejected in the right process. We
then have to typecheck B's positive answer together with the decoy message:
$\Gamma \vdash \mathsf{aenc}(\langle y_1, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle, \mathsf{pk}(k_a)) \sim \mathsf{aenc}(N_b, \mathsf{pk}(k)) : \mathsf{LL}$.

Figure 9 presents the type derivation for this example. We apply rule TAENCH
to give type LL to the two terms, adding the two encryptions to the constraint set.
Using rule TAENC we can show that the encryptions are well-typed with type
$\{\mathsf{HL}\}_{\mathsf{pkey}(\mathsf{key}^{\mathsf{HH}}(\mathsf{HL}))}$. The type of the payload is trivially shown with rule THIGH.
To type the public key, we use rule TPUBKEY followed by rule TKEY, which
looks up the type for the bikey $(k_a, k)$ in the typing environment $\Gamma$.


5 Consistency 


Our type system collects constraints that intuitively correspond to (symbolic) 
messages that the attacker may see (or deduce). Therefore, two processes are in 
trace equivalence only if the collected constraints are in static equivalence for 
any plausible instantiation. 

However, checking static equivalence of symbolic frames for any instantia- 
tion corresponding to a real execution may be as hard as checking trace equiva- 
lence [24]. Conversely, checking static equivalence for any instantiation may be 
too strong and may prevent proving equivalence of processes. Instead, we use 
again the typing information gathered by our type system and we consider only 
instantiations that comply with the type. Actually, we even restrict our attention 
to instantiations where variables of type LL are only replaced by deducible terms. 
This last part is a key ingredient for considering processes with dynamic keys. 
Hence, we define a constraint to be consistent if the corresponding two frames 
are in static equivalence for any instantiation that can be typed and produces 
constraints that are included in the original constraint. 

Formally, we first introduce the following ingredients: 


- $\phi_\ell(c)$ and $\phi_r(c)$ denote the frames that are composed of the left and the right
terms of the constraints respectively (in the same order).

- $\phi^\Gamma_{\mathsf{LL}}$ denotes the frame that is composed of all low confidentiality nonces and
keys in $\Gamma$, as well as all public encryption keys and verification keys in $\Gamma$.
This intuitively corresponds to the initial knowledge of the attacker.

- Two ground substitutions $\sigma$, $\sigma'$ are well-typed in $\Gamma$ with constraint $c_\sigma$ if they
preserve the types for variables in $\Gamma$, i.e., for all $x$, $\Gamma \vdash \sigma(x) \sim \sigma'(x) : \Gamma(x) \to c_x$,
and $c_\sigma = \bigcup_{x \in dom(\Gamma)} c_x$.


The instantiation of a constraint is defined as expected. If $c$ is a set of constraints,
and $\sigma$, $\sigma'$ are two substitutions, let $[c]_{\sigma,\sigma'}$ be the instantiation of $c$ by $\sigma$ on the
left and $\sigma'$ on the right, that is, $[c]_{\sigma,\sigma'} = \{M\sigma \sim N\sigma' \mid M \sim N \in c\}$.
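
As an illustration, the instantiation of a constraint set can be written in a few lines, assuming the hypothetical tuple encoding where atoms are strings, applications are tuples headed by their function symbol, a constraint is a pair `(M, N)`, and substitutions are dictionaries from variable names to terms.

```python
def apply_subst(term, sigma):
    """Apply substitution `sigma` (dict from variable names to terms) to a term."""
    if isinstance(term, str):
        return sigma.get(term, term)
    head, *args = term
    return (head, *(apply_subst(a, sigma) for a in args))


def instantiate(c, sigma_left, sigma_right):
    """Instantiate constraints: sigma_left on left terms, sigma_right on right terms."""
    return {(apply_subst(m, sigma_left), apply_subst(n, sigma_right)) for (m, n) in c}
```

For instance, instantiating the single constraint `enc(n, y) ~ enc(n, y)` with `y` mapped to `k1` on the left and `k2` on the right gives `enc(n, k1) ~ enc(n, k2)`.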


Definition 3 (Consistency). A set of constraints $c$ is consistent in an environment
$\Gamma$ if for all substitutions $\sigma$, $\sigma'$ well-typed in $\Gamma$ with a constraint $c_\sigma$ such
that $c_\sigma \subseteq [c]_{\sigma,\sigma'}$, the frames $\phi^\Gamma_{\mathsf{LL}} \cup \phi_\ell(c)\sigma$ and $\phi^\Gamma_{\mathsf{LL}} \cup \phi_r(c)\sigma'$ are statically equivalent.
We say that $(c, \Gamma)$ is consistent if $c$ is consistent in $\Gamma$ and that a constraint
set $C$ is consistent in $\Gamma$ if each element $(c, \Gamma) \in C$ is consistent.


Compared to [28], we now require $c_\sigma \subseteq [c]_{\sigma,\sigma'}$. This means that instead of
considering any (well typed) instantiations, we only consider instantiations that 
use fragments of the constraints. For example, this now imposes that low vari- 
ables are instantiated by terms deducible from the constraint. This refinement 
of consistency provides a tighter definition and is needed for non fixed keys, as 
explained in the next section. 
6 Soundness


In this section, we provide our main results. First, soundness of our type system: 
whenever two processes can be typed with consistent constraints, then they are 
in trace equivalence. Then we show how to automatically prove consistency. 
Finally, we explain how to lift these first two results from finite processes to
processes with replication. But first, we discuss why we cannot directly apply 
the results from [28] developed for processes with long term keys. 


6.1 Example 


Consider the following example, typical for a key-exchange protocol: Alice 
receives some key and uses it to encrypt, e.g. a nonce. Here, we consider a 
semi-honest session, where an honest agent A is receiving a key from a dishonest
agent C. Such sessions are typically considered in combination with honest
sessions.

\[
\begin{array}{l}
C \to A : \mathsf{aenc}(\langle k, C\rangle, \mathsf{pk}(A)) \\
A \to C : \mathsf{aenc}(n, k)
\end{array}
\]


The process modelling the role of Alice is as follows.

\[
\begin{array}{l}
P_A = \mathsf{in}(x).\ \mathsf{let}\ x' = \mathsf{adec}(x, k_A)\ \mathsf{in}\ \mathsf{let}\ y = \pi_1(x')\ \mathsf{in}\ \mathsf{let}\ z = \pi_2(x')\ \mathsf{in} \\
\qquad \mathsf{if}\ z = C\ \mathsf{then}\ \mathsf{new}\ n.\ \mathsf{out}(\mathsf{enc}(n, y))
\end{array}
\]


When type-checking $P_A \sim P_A$ (as part of a more general process with honest
sessions), we would collect the constraint $\mathsf{enc}(n, y) \sim \mathsf{enc}(n, y)$ where $y$ comes
from the adversary and is therefore a low variable (that is, of type LL). The approach
of [28] consisted in opening messages as much as possible. In this example,
this would yield the constraint $y \sim y$ which typically renders the constraint
inconsistent, as exemplified below.

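When typechecking the private authentication protocol, we obtain constraints
containing $\mathsf{aenc}(\langle y_1, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle, \mathsf{pk}(k_a)) \sim \mathsf{aenc}(N_b, \mathsf{pk}(k))$ (as seen
in Fig. 9), where $y_1$ has type HL. Assume now that the constraint also contains
$y \sim y$ for some variable $y$ of type LL and consider the following instantiations
of $y$ and $y_1$: $\sigma(y_1) = \sigma'(y_1) = a$ for some constant $a$ and $\sigma(y) = \sigma'(y) =
\mathsf{aenc}(N_b, \mathsf{pk}(k))$. Note that such an instantiation complies with the type since
$\Gamma \vdash \sigma(y) \sim \sigma'(y) : \mathsf{LL} \to c$ for some constraint $c$. The instantiated constraint
would then contain

\[
\begin{array}{l}
\{\mathsf{aenc}(\langle a, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle, \mathsf{pk}(k_a)) \sim \mathsf{aenc}(N_b, \mathsf{pk}(k)), \\
\ \ \mathsf{aenc}(N_b, \mathsf{pk}(k)) \sim \mathsf{aenc}(N_b, \mathsf{pk}(k))\}
\end{array}
\]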


and the corresponding frames are not statically equivalent, which makes the 
constraint inconsistent for the consistency definition of [28]. 
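
The distinguishing test in this example is a plain equality check between the two messages of each frame. The sketch below spells it out under an assumed tuple encoding of terms; the helper names `pair`, `pk`, `aenc` and `equality_pattern` are illustrative only.

```python
def pair(a, b):
    return ("pair", a, b)


def pk(k):
    return ("pk", k)


def aenc(m, key):
    return ("aenc", m, key)


decoy = aenc("Nb", pk("k"))

# Left and right components of the instantiated constraint, in the same order.
left_frame = [aenc(pair("a", pair("Nb", pk("kb"))), pk("ka")), decoy]
right_frame = [decoy, decoy]


def equality_pattern(frame):
    """For each pair of positions, record whether the attacker's equality test succeeds."""
    size = len(frame)
    return [[frame[i] == frame[j] for j in range(size)] for i in range(size)]
```

Comparing the first and second message succeeds on the right frame and fails on the left one, so `equality_pattern(left_frame)` and `equality_pattern(right_frame)` differ and the frames are not statically equivalent.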

Therefore, our first idea consists in proving that we only collect constraints 
that are saturated w.r.t. deduction: any deducible subterm can already be con- 
structed from the terms of the constraint. Second, we show that for any exe- 
cution, low variables are instantiated by terms deducible from the constraints. 
This guarantees that our new notion of consistency is sound. The two results are
reflected in the next section. 


6.2 Soundness


Our type system, together with consistency, implies trace equivalence.


Theorem 1 (Typing implies trace equivalence). For all $P$, $Q$, and $C$, for
all $\Gamma$ containing only keys, if $\Gamma \vdash P \sim Q \to C$ and $C$ is consistent, then $P \approx_t Q$.


Example 3. We can typecheck PA, that is

$\Gamma \vdash P_a(k_a, \mathsf{pk}(k_b)) \mid P_b(k_b, \mathsf{pk}(k_a)) \sim P_a(k_a, \mathsf{pk}(k_b)) \mid P_b(k_b, \mathsf{pk}(k_c)) \to C_{PA}$

where $\Gamma$ has been defined in Fig. 1 and assuming that nonce $N_a$ of process $P_a$
has been annotated with type $\tau^{\mathtt{HH},1}_{N_a}$ and nonce $N_b$ of $P_b$ has been annotated
with type $\tau^{\mathtt{HH},1}_{N_b}$. The constraint set $C_{PA}$ can be proved to be consistent using the
procedure presented in the next section. Therefore, we can conclude that

$P_a(k_a, \mathsf{pk}(k_b)) \mid P_b(k_b, \mathsf{pk}(k_a)) \approx_t P_a(k_a, \mathsf{pk}(k_b)) \mid P_b(k_b, \mathsf{pk}(k_c))$

which shows anonymity of the private authentication protocol.


The first key ingredient in the proof of Theorem 1 is the fact that any well- 
typed low term is deducible from the constraint generated when typing it. 


Lemma 1 (Low terms are recipes on their constraints). For all ground
messages $M$, $N$, for all $\Gamma$, $c$, if $\Gamma \vdash M \sim N : \mathtt{LL} \to c$ then there exists an
attacker recipe $R$ without destructors such that $M = R(\phi_\ell(c) \cup \phi^\Gamma_{\mathtt{LL}})$ and $N =
R(\phi_r(c) \cup \phi^\Gamma_{\mathtt{LL}})$.

The second key ingredient is a finer invariant on protocol executions: for 
any typable pair of processes P,Q, any execution of P can be mimicked by 
an execution of Q such that low variables are instantiated by well-typed terms 
constructible from the constraint. 


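Lemma 2. For all processes $P$, $Q$, for all $\phi$, $\sigma$, for all multisets of processes
$\mathcal{P}$, constraint sets $C$, sequences $s$ of actions, for all $\Gamma$ containing only keys, if
$\Gamma \vdash P \sim Q \to C$, $C$ is consistent, and $(\{P\}, \emptyset, \emptyset) \xrightarrow{s}_{*} (\mathcal{P}, \phi, \sigma)$, then there
exist a sequence $s'$ of actions, a multiset $\mathcal{Q}$, a frame $\phi'$, a substitution $\sigma'$, an
environment $\Gamma'$, a constraint $c$ such that:

– $(\{Q\}, \emptyset, \emptyset) \xrightarrow{s'}_{*} (\mathcal{Q}, \phi', \sigma')$, with $s =_\tau s'$

– $\Gamma' \vdash \phi\sigma \sim \phi'\sigma' : \mathtt{LL} \to c$, and for all $x \in \mathrm{dom}(\sigma) \cap \mathrm{dom}(\sigma')$, there exists $c_x$
such that $\Gamma' \vdash \sigma(x) \sim \sigma'(x) : \Gamma'(x) \to c_x$ and $c_x \subseteq c$.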


Note that this finer invariant guarantees that we can restrict our attention 
to the instantiations considered for defining consistency. 

As a by-product, we obtain a finer type system for equivalence, even for 
processes with long term keys (as in [28]). For example, we can now prove equiv- 
alence of processes where some agent signs a low message that comes from the 
adversary. In such a case, we collect $\mathsf{sign}(x, k) \sim \mathsf{sign}(x, k)$ in the constraint,
where $x$ has type $\mathtt{LL}$, which we can now prove to be consistent (depending on
how $x$ is used in the rest of the constraint).


6.3 Procedure for Consistency


We devise a procedure check_const(C) for checking consistency of a constraint
C, depicted in Fig. 10. Compared to [28], the procedure is actually simplified.
Thanks to Lemmas 1 and 2, there is no need to open constraints anymore.
The rest is very similar and works as follows:


– First, variables of refined type $[\![\tau^{l,1}_m\,;\,\tau^{l',1}_n]\!]$ are replaced by $m$ on the
left-hand side of the constraint and $n$ on the right-hand side.

– Second, we check that terms have the same shape (encryption, signature,
hash) on the left and on the right and that asymmetric encryptions and hashes
cannot be reconstructed by the adversary (that is, they contain some fresh
nonce).

– The most important step consists in checking that the terms on the left satisfy
the same equalities as the ones on the right. Whenever two left terms $M$
and $N$ are unifiable, their corresponding right terms $M'$ and $N'$ should be
equal after applying a similar instantiation.
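
The third step can be illustrated with a small sketch. The Python code below is not the procedure of Fig. 10: it is a simplified illustration that applies the full most general unifier to the other side, ignoring both the restriction to low variables and the special treatment of refined nonce types. The term encoding (tuples for function applications, strings starting with "?" for variables) and all function names are assumptions made for this example.

```python
from itertools import combinations

def is_var(t):
    # Assumption of this sketch: variables are strings starting with "?".
    return isinstance(t, str) and t.startswith("?")

def walk(t, subst):
    # Follow variable bindings until reaching a non-variable or an unbound variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    # Occurs check: does variable v appear in term t (under subst)?
    t = walk(t, subst)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, a, subst) for a in t[1:])
    return False

def unify(s, t):
    # Syntactic first-order unification; returns a substitution or None.
    subst = {}
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            if occurs(a, b, subst):
                return None
            subst[a] = b
        elif is_var(b):
            if occurs(b, a, subst):
                return None
            subst[b] = a
        elif (isinstance(a, tuple) and isinstance(b, tuple)
              and a[0] == b[0] and len(a) == len(b)):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None
    return subst

def apply(t, subst):
    # Apply a substitution to a term.
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, subst) for a in t[1:])
    return t

def same_equalities(constraint):
    # constraint: list of (left, right) term pairs, i.e. M ~ M'.
    # Whenever two left terms unify, the right terms must coincide under the
    # same unifier, and symmetrically for the right terms.
    for (m, m2), (n, n2) in combinations(constraint, 2):
        mu = unify(m, n)
        if mu is not None and apply(m2, mu) != apply(n2, mu):
            return False
        mu = unify(m2, n2)
        if mu is not None and apply(m, mu) != apply(n, mu):
            return False
    return True
```

On the instantiated constraint of Sect. 6.1, the two right-hand terms are identical while the left-hand terms differ, so this check fails, which matches the informal argument given there.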


For constraint sets without infinite nonce types, check_const entails consis- 
tency. 


Theorem 2. Let $C$ be a set of constraints such that
$\forall (c, \Gamma) \in C.\ \forall l, l', m, p.\ \Gamma(x) \neq [\![\tau^{l,\infty}_m\,;\,\tau^{l',\infty}_p]\!]$.
If check_const($C$) = true, then $C$ is consistent.


Example 4. Continuing Example 3, typechecking the PA protocol yields the set
$C_{PA}$ of constraint sets. $C_{PA}$ contains in particular the set


$\{\mathsf{aenc}(\langle N_a, \mathsf{pk}(k_a)\rangle, \mathsf{pk}(k_b)) \sim \mathsf{aenc}(\langle N_a, \mathsf{pk}(k_a)\rangle, \mathsf{pk}(k_b)),$
$\ \ \mathsf{aenc}(\langle y_1, \langle N_b, \mathsf{pk}(k_b)\rangle\rangle, \mathsf{pk}(k_a)) \sim \mathsf{aenc}(N_b, \mathsf{pk}(k))\}$


where variable $y_1$ has type $\mathtt{HL}$ (we also have the same constraint but where
$y_1$ has type $\mathtt{LL}$). The other constraint sets of $C_{PA}$ are similar and correspond
to the various cases (else branch of $P_b$ with then branch of $P_b$, etc.). The
procedure check_const returns true since no two terms can be unified, which
proves consistency. Similarly, the other constraints generated for PA can be
proved to be consistent by applying check_const.


6.4 From Finite to Replicated Processes 


The previous results apply to processes without replication only. In the spirit 
of [28], we lift our results to replicated processes. We proceed in two steps. 


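1. Whenever $\Gamma \vdash P \sim Q \to C$, we show that:
$[\Gamma]_1 \cup \dots \cup [\Gamma]_n \vdash [P]_1 \mid \dots \mid [P]_n \sim [Q]_1 \mid \dots \mid [Q]_n \to [C]_1 \cup_\times \dots \cup_\times [C]_n$,
where $[\Gamma]_i$ is intuitively a copy of $\Gamma$, where variables $x$ have been replaced
by $x_i$, and nonces or keys $n$ of infinite type $\tau^{l,\infty}_n$ (or $\mathsf{seskey}^{l,\infty}(T)$) have been
replaced by $n_i$. The copies $[P]_i$, $[Q]_i$, and $[C]_i$ are defined similarly.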
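

$\mathtt{step1}_\Gamma(c) := ([\![c]\!]_{\sigma_F, \sigma'_F}, \Gamma')$ with

$F := \{x \in \mathrm{dom}(\Gamma) \mid \exists m, n, l, l'.\ \Gamma(x) = [\![\tau^{l,1}_m\,;\,\tau^{l',1}_n]\!]\}$
and $\sigma_F$, $\sigma'_F$ defined by

• $\mathrm{dom}(\sigma_F) = \mathrm{dom}(\sigma'_F) = F$
• $\forall x \in F.\ \forall m, n, l, l'.\ \Gamma(x) = [\![\tau^{l,1}_m\,;\,\tau^{l',1}_n]\!] \Rightarrow \sigma_F(x) = m \wedge \sigma'_F(x) = n$

and $\Gamma'$ is $\Gamma|_{\mathrm{dom}(\Gamma) \setminus F}$ extended with $\Gamma'(n) = \tau^{l,1}_n$ for all nonce $n$ such that $\tau^{l,1}_n$ occurs in
$\Gamma$.


$\mathtt{step2}_\Gamma(c) :=$ check that for all $M \sim N \in c$, $M$ and $N$ are both

– $\mathsf{enc}(M', M'')$, $\mathsf{enc}(N', N'')$ where $M''$, $N''$ are either
  • keys $k$, $k'$ where $\exists T.\ \Gamma(k, k') <: \mathsf{key}^{\mathtt{HH}}(T)$;
  • or a variable $x$ such that $\exists T.\ \Gamma(x) <: \mathsf{key}^{\mathtt{HH}}(T)$;
– or encryptions $\mathsf{aenc}(M', M'')$, $\mathsf{aenc}(N', N'')$ where
  • $M'$ and $N'$ contain directly under pairs a nonce $n$ such that $\Gamma(n) = \tau^{\mathtt{HH},a}_n$ or a
    secret key $k$ such that $\exists T, k'.\ \Gamma(k, k') <: \mathsf{key}^{\mathtt{HH}}(T)$ or $\Gamma(k', k) <: \mathsf{key}^{\mathtt{HH}}(T)$, or
    a variable $x$ such that $\exists m, n, a.\ \Gamma(x) = [\![\tau^{\mathtt{HH},a}_m\,;\,\tau^{\mathtt{HH},a}_n]\!]$, or a variable $x$ such that
    $\exists T.\ \Gamma(x) <: \mathsf{key}^{\mathtt{HH}}(T)$;
  • $M''$ and $N''$ are either
    * public keys $\mathsf{pk}(k)$, $\mathsf{pk}(k')$ where $\exists T.\ \Gamma(k, k') <: \mathsf{key}^{\mathtt{HH}}(T)$;
    * or public keys $\mathsf{pk}(x)$, $\mathsf{pk}(x)$ where $\exists T.\ \Gamma(x) <: \mathsf{key}^{\mathtt{HH}}(T)$;
    * or a variable $x$ such that $\exists T, T'.\ \Gamma(x) = \mathsf{pkey}(T)$ and $T <: \mathsf{key}^{\mathtt{HH}}(T')$;
– or hashes $\mathsf{h}(M')$, $\mathsf{h}(N')$, where $M'$, $N'$ similarly contain a secret value under pairs;
– or signatures $\mathsf{sign}(M', M'')$, $\mathsf{sign}(N', N'')$ where $M''$, $N''$ are either
  • keys $k$, $k'$ where $\exists T.\ \Gamma(k, k') <: \mathsf{key}^{\mathtt{HH}}(T)$;
  • or a variable $x$ such that $\exists T.\ \Gamma(x) <: \mathsf{key}^{\mathtt{HH}}(T)$;


$\mathtt{step3}_\Gamma(c) :=$ If for all $M \sim M'$ and $N \sim N' \in c$ such that $M$, $N$ are unifiable with a
most general unifier $\mu$, and such that

$\forall x \in \mathrm{dom}(\mu).\ \exists l, l', m, p.\ (\Gamma(x) = [\![\tau^{l,\infty}_m\,;\,\tau^{l',\infty}_p]\!]) \Rightarrow (x\mu \in \mathcal{X} \vee \exists i.\ x\mu = m_i)$

we have

$M'\alpha\theta = N'\alpha\theta$

where

$\forall x \in \mathrm{dom}(\mu).\ \forall l, l', m, p, i.\ (\Gamma(x) = [\![\tau^{l,\infty}_m\,;\,\tau^{l',\infty}_p]\!] \wedge \mu(x) = m_i) \Rightarrow \theta(x) = p_i$

and $\alpha$ is the restriction of $\mu$ to $\{x \in \mathrm{dom}(\mu) \mid \Gamma(x) = \mathtt{LL} \wedge \mu(x) \in \mathcal{N}\}$;
and if the symmetric condition for the case where $M'$, $N'$ are unifiable holds as well, then
return true.


$\mathtt{check\_const}(C) :=$ for all $(c, \Gamma) \in C$, let $(c_1, \Gamma_1) := \mathtt{step1}_\Gamma(c)$ and check that
$\mathtt{step2}_{\Gamma_1}(c_1) = \mathtt{true}$ and $\mathtt{step3}_{\Gamma_1}(c_1) = \mathtt{true}$.


Fig. 10. Procedure for checking consistency.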


2. We cannot directly check consistency of infinitely many constraints of the
form $[C]_1 \cup_\times \dots \cup_\times [C]_n$. Instead, we show that it is sufficient to check
consistency of two copies $[C]_1 \cup_\times [C]_2$ only. The reason why we need two copies
(and not just one) is to detect when messages from different sessions may
become equal.
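
The renaming behind the copies $[C]_i$ can be sketched as follows. This is only an illustration under simplifying assumptions: it handles a single constraint (a list of term pairs) rather than a set of constraint sets with typing environments, it replaces $\cup_\times$ by plain concatenation, and the term encoding and function names are hypothetical.

```python
def rename(term, i, variables, infinite_names):
    # Terms are tuples ("f", arg1, ...) or atoms (strings); this encoding is an
    # assumption of the sketch. Variables and names of infinite type get the
    # session index i appended; everything else is left unchanged.
    if isinstance(term, tuple):
        return (term[0],) + tuple(
            rename(a, i, variables, infinite_names) for a in term[1:]
        )
    if term in variables or term in infinite_names:
        return f"{term}_{i}"
    return term

def copy_constraint(constraint, i, variables, infinite_names):
    # Build the i-th copy of a constraint given as a list of (left, right) pairs.
    return [
        (rename(m, i, variables, infinite_names),
         rename(n, i, variables, infinite_names))
        for m, n in constraint
    ]

def two_copies(constraint, variables, infinite_names):
    # Two session copies side by side, so that a later consistency check can
    # compare messages coming from different sessions.
    return (copy_constraint(constraint, 1, variables, infinite_names)
            + copy_constraint(constraint, 2, variables, infinite_names))
```

For instance, a constraint whose left-hand term contains no session-specific name yields two identical left-hand terms in the two copies, while the right-hand terms may differ; a single copy could not expose this.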


Formally, we can prove trace equivalence of replicated processes. 


Theorem 3. Consider $P$, $Q$, $P'$, $Q'$, $C$, $C'$, such that $P$, $Q$ and $P'$, $Q'$ do
not share any variable. Consider $\Gamma$, containing only keys and nonces with finite
types.

Assume that $P$ and $Q$ only bind nonces and keys with infinite nonce types, i.e.
using $\mathtt{new}\ m : \tau^{l,\infty}_m$ and $\mathtt{new}\ k : \mathsf{seskey}^{l,\infty}(T)$ for some label $l$ and type $T$; while
$P'$ and $Q'$ only bind nonces and keys with finite types, i.e. using $\mathtt{new}\ m : \tau^{l,1}_m$
and $\mathtt{new}\ k : \mathsf{seskey}^{l,1}(T)$.

Let us abbreviate by $\mathtt{new}\ \overline{n}$ the sequence of declarations of each nonce $m \in
\mathrm{dom}(\Gamma)$ and session key $k$ such that $\Gamma(k, k) = \mathsf{seskey}^{l,1}(T)$ for some $l$, $T$. If


– $\Gamma \vdash P \sim Q \to C$,
– $\Gamma \vdash P' \sim Q' \to C'$,
– $\mathtt{check\_const}([C]_1 \cup_\times [C]_2 \cup_\times [C']_1) = \mathtt{true}$,


then $\mathtt{new}\ \overline{n}.\ ((!P) \mid P') \approx_t \mathtt{new}\ \overline{n}.\ ((!Q) \mid Q')$.


Interestingly, Theorem 3 allows us to consider a mix of finite and replicated
processes.


7 Experimental Results 


We implemented our typechecker as well as our procedure for consistency in a 
prototype tool TypeEq. We adapted the original prototype of [28] to implement 
additional cases corresponding to the new typing rules. This also required
designing new heuristics w.r.t. the order in which typing rules should be applied.
Of course, we also had to add support for the new bikey types, and for arbitrary terms
as keys. This represented a change of about 40% of the code of the software. We
ran our experiments on a single Intel Xeon E5-2687Wv3 3.10 GHz core, with 
378 GB of RAM (shared with the 19 other cores). Actually, our own prototype 
does not require a large amount of RAM. However, some of the other tools we 
consider use more than 64GB of RAM on some examples (at which point we 
stopped the experiment). More precise figures about our tool are provided in the 
table of Fig. 11. The corresponding files can be found at [27]. 

We tested TypeEq on two symmetric key protocols that include a handshake 
on the key (Yahalom-Lowe and Needham-Schroeder symmetric key protocols). 
In both cases, we prove key usability of the exchanged key. Intuitively, we show 
that an attacker cannot distinguish between two encryptions of public constants: 
$P.\mathtt{out}(\mathsf{enc}(a, k)) \approx_t P.\mathtt{out}(\mathsf{enc}(b, k))$. We also consider one standard
asymmetric key protocol (Needham-Schroeder-Lowe protocol), showing strong secrecy of
the exchanged nonce.

Helios [4] is a well-known voting protocol. We show ballot privacy, in the
presence of a dishonest board, assuming that voters do not revote (otherwise
the protocol is subject to a copy attack [39], a variant of [30]). We consider a
more precise model than the previous Helios models which assume that voters
initially know the election public key. Here, we model the fact that voters
actually receive the (signed) freshly generated election public key from the network.
The BAC protocol is one of the protocols embedded in the biometric passport [1].
We show anonymity of the passport holder $P(A) \approx_t P(B)$. Actually, the only
data that distinguish $P(A)$ from $P(B)$ are the private keys. Therefore we
consider an additional step where the passport sends the identity of the agent to
the reader, encrypted with the exchanged key. Finally, we consider the private
authentication protocol, as described in this paper.


7.1 Bounded Number of Sessions 


We first compare TypeEq with the tools for a bounded number of sessions. 
Namely, we consider Akiss [22], APTE [23] as well as its optimised variant 
with partial order reduction APTE-POR [10], SPEC [32], and SatEquiv [26]. 
We step by step increase the number of sessions until we reach a “complete” 
scenario where each role is instantiated by A talking to B, A talking to C, B 
talking to A, and B talking to C, where A, B are honest while C is dishonest. 


Protocols               | # sessions | Akiss | APTE  | APTE-POR | Spec  | Sat-Eq | TypeEq time | TypeEq memory
Needham-Schroeder       | 3          | 4.2s  | 0.39s | 0.086s   | 59.3s | 0.14s  | 0.006s      | 4.0 MB
(symmetric)             | 6          | TO    | TO    | 9m22s    | TO    | 0.53s  | 0.009s      | 4.7 MB
                        | 10         |       |       | SO       |       | 3.7s   | 0.012s      | 5.0 MB
                        | 14         |       |       |          |       | 18s    | 0.015s      | 6.9 MB
Yahalom-Lowe            | 3          | 1.0s  | 2.9s  | 0.095s   | 10s   | 0.063s | 0.006s      | 3.8 MB
                        | 6          | MO    | TO    | 11m20s   | MO    | 0.26s  | 0.017s      | 4.9 MB
                        | 10         |       |       | SO       |       | 3.0s   | 0.015s      | 4.9 MB
                        | 14         |       |       |          |       | 18s    | 0.019s      | 5.0 MB
Needham-Schroeder-Lowe  | 2          | 0.10s | 3.8s  | 0.06s    | 28s   | x      | 0.004s      | 3.1 MB
                        | 4          | 1m8s  | BUG   | BUG      | TO    |        | 0.004s      | 3.4 MB
                        | 8          | TO    |       |          |       |        | 0.007s      | 4.7 MB
Private Authentication  | 2          | 0.19s | 1.2s  | 0.034s   | x     | x      | 0.004s      | 3.2 MB
                        | 4          | 99m   | TO    | 24.6s    |       |        | 0.013s      | 4.9 MB
                        | 8          | MO    |       | TO       |       |        | 1s          | 37 MB
Helios                  | 3          | MO    | BUG   | BUG      | x     | x      | 0.005s      | 3.5 MB
BAC                     | 2          | 4.0s  | 0.20s | 0.032s   | x     | x      | 0.004s      | 2.9 MB
                        | 3          | SO    | 185m  | 2.6s     |       |        | 0.004s      | 3.1 MB
                        | 5          |       | TO    | 107m     |       |        | 0.005s      | 3.4 MB
                        | 7          |       |       | TO       |       |        | 0.005s      | 3.8 MB

TO: Time Out (>12h) MO: Memory Overflow (>64GB) SO: Stack Overflow 


Fig. 11. Experimental results for the bounded case 


This yields 14 sessions for symmetric-key protocols with two agents and one
server, and 8 sessions for a protocol with two agents. In some cases, we further
increase the number of sessions (replicating identical scenarios) to better
compare tool performance. The results of our experiments are reported in Fig. 11.
Note that SatEquiv fails to cover several cases because it handles neither
asymmetric encryption nor else branches.


7.2 Unbounded Number of Sessions 


We then compare TypeEq with ProVerif. As shown in Fig. 12, the performance of
the two tools is similar, except that ProVerif cannot prove Helios. The reason lies in the
fact that Helios is actually subject to a copy attack if voters revote and ProVerif
cannot properly handle processes that are executed only once. Similarly, Tamarin
cannot properly handle the else branch of Helios (which models that the ballot
box rejects duplicated ballots). Tamarin fails to prove that the underlying check
either succeeds or fails on both sides.


Protocols                | ProVerif | TypeEq
Helios                   | x        | 0.005s
Needham-Schroeder (sym)  | 0.23s    | 0.016s
Needham-Schroeder-Lowe   | 0.08s    | 0.008s
Yahalom-Lowe             | 0.48s    | 0.020s
Private Authentication   | 0.034s   | 0.008s
BAC                      | 0.038s   | 0.005s


Fig. 12. Experimental results for an unbounded number of sessions 


8 Conclusion and Discussion 


We devise a new type system to reason about keys in the context of equivalence 
properties. Our new type system significantly enhances the preliminary work 
of [28], covering a larger class of protocols that includes key-exchange proto- 
cols, protocols with setup phases, as well as protocols that branch differently 
depending on the decryption key. 

Our type system requires a light type annotation that can be directly inferred 
from the structure of the messages. As future work, we plan to develop an auto- 
matic type inference system. In our case study, the only intricate case is the 
Helios protocol where the user has to write a refined type that corresponds to 
an over-approximation of any encrypted message. We plan to explore whether 
such types could be inferred automatically. 

We also plan to study how to add phases to our framework, in order to cover 
more properties (such as unlinkability). This would require generalizing our type
system to account for the fact that the type of a key may depend on the phase
in which it is used.

Another limitation of our type system is that it does not address processes
with too dissimilar structure. While our type system goes beyond
diff-equivalence, e.g. allowing else branches to be matched with then branches, we
cannot prove equivalence of processes where traces of P are dynamically mapped 
to traces of Q, depending on the attacker’s behaviour. Such cases occur for exam- 
ple when proving unlinkability of the biometric passport. We plan to explore how 
to enrich our type system with additional rules that could cover such cases, tak- 
ing advantage of the modularity of the type system. 

Conversely, the fact that our type system discards processes that are in equiv- 
alence shows that our type system proves something stronger than trace equiv- 
alence. Indeed, processes P and Q have to follow some form of uniformity. We 
could exploit this to prove stronger properties like oblivious execution, prob- 
ably further restricting our typing rules, in order to prove e.g. the absence of 
side-channels of a certain form. 


Abstract. Over the last few years, there has been an almost exponential
increase of the number of mobile applications that deal with sensitive
data, such as applications for e-commerce or health. When dealing
with sensitive data, classical authentication solutions based on username- 
password pairs are not enough, and multi-factor authentication solutions 
that combine two or more authentication elements of different categories 
are required. Many different such solutions are available, but they usu- 
ally cover the scenario of a user accessing web applications on their lap- 
tops, whereas in this paper we focus on native mobile applications. This 
changes the exploitable attack surface and thus requires a specific analy- 
sis. In this paper, we present the design, the formal specification and the 
security analysis of a solution that allows users to access different mobile 
applications through a multi-factor authentication solution providing a 
Single Sign-On experience. The formal and automated analysis that we 
performed validates the security goals of the solution we propose. 


1 Introduction 


Context and Motivations. Over the last few years, there has been an almost 
exponential increase of the number of mobile applications (or apps, for short) 
that deal with sensitive data, ranging from apps for e-commerce, banking and 
finance to apps for well-being and health. One of the main reasons behind such 
a success is that mobile apps considerably increase the portability and efficiency 
of online services. Banking apps allow users not only to check their account 
balances but also to move money and pay bills or friends [1]. Mobile health 
apps range from personal health records (PHR) to personal digital assistants 
using connected devices such as smartwatches and other body-worn devices or 
implants. As reported in [2], there are nowadays more than 100,000 mobile health 
apps on the market, a number that is increasing on a weekly basis. 

However, reports on security and privacy issues in mobile apps are also
increasing on a weekly basis, bearing concrete witness to the fact that the
management of sensitive data is often not properly taken into account by the
developers of the apps. For example, the studies performed by He et al. [3] on
free mobile health apps available on the Google Play store show that the
majority of these apps send sensitive data in clear text and store it on third-party
servers that do not support the required confidentiality measures.

When dealing with sensitive data, classical authentication solutions based 
on username-password pairs are not enough. The “General Data Protection 
Regulation” [4] mandates that specific security measures must be implemented, 
including multi-factor authentication, a strong(er) authentication solution that 
combines two or more authentication elements of different categories (e.g., 
a password combined with a pin sent to a mobile device, or some biomet- 
ric data). There are many alternative solutions on the market for providing 
multi-factor authentication. Examples are FIDO (Fast IDentity Online,
https://fidoalliance.org), which enables mobile devices to act as U2F (Universal 2nd
Factor) authentication devices over Bluetooth or NFC, and Mobile Connect
(https://mobileconnect.io), which identifies users through their mobile phone
numbers.

In addition to the establishment of high-level security for authentication solu- 
tions for mobile apps, it is essential to take the usability aspect into considera- 
tion. Monitoring apps often require a daily or even hourly use, but understand- 
ably users cannot be bothered by a long and complex authentication procedure 
each time they want to read or update their data, especially on mobile devices 
where the keyboard is small and sometimes uncomfortable to use. A better 
usability can be provided by supporting a Single Sign-On (SSO) experience, 
which allows users to access different, federated apps by performing a single 
login carried out with a selected identity provider (e.g., Facebook or Google). 
While the authentication session is valid, users can directly access all the apps 
in the federation, without having to enter their credentials again and again. 


Contributions. In this paper, we present the design, the formal specification and 
the security analysis of a solution that allows users to access different mobile apps 
through a multi-factor authentication solution providing a SSO experience. 

We focus on multi-factor authentication solutions that use One Time Pass- 
words (OTPs), which are passwords that are valid for a short time and can 
only be used once. We have selected OTP-generation approaches as they are 
commonly used to provide strong authentication and many alternative solu- 
tions (from physical to software tools) are available on the market. For instance, 
Google Authenticator is a mobile app that generates OTPs [5]. Like Google 
Authenticator, many of the OTP-generation solutions on the market are appli- 
cable only for web solutions and use mobile devices as an additional factor. 

However, in the scenario considered in this paper, users are not accessing 
web apps on their laptops or desktop computers, but instead they are accessing 
native mobile apps. In relation to SSO and multi-factor authentication, web 
and mobile environments and channels guarantee different security properties, 
e.g., in web scenarios identity providers can authenticate service provider apps 
using shared secrets, but this is not possible for native mobile apps that are 
unable to keep values secret. This changes the exploitable attack surface and
thus requires a specific analysis. To the best of our knowledge, the definition of 
a multi-factor authentication solution for native apps is still not well specified. 
Even if there are some solutions currently used, their security analyses have 
been performed informally or semi-formally at best, and without following a 
standardized formal procedure. This makes a comparison between the different 
solutions both complex and potentially misleading. 

For the security assumptions and the design of a native SSO solution, our 
work is based on [6,7]. In this previous work, we presented a solution for native 
SSO and performed a semi-formal security analysis. In this work, we extend 
these studies by providing a multi-factor authentication solution and a formal 
analysis of the identified security goals. 

Summarizing, our contributions are four-fold as we have 


1. designed a multi-factor authentication solution that uses OTPs as an authen- 
ticator factor and provides a SSO experience for native apps; 

2. provided a description of the proposed solution detailing the security and 
trust assumptions; 

3. formally defined the security goals of our multi-factor authentication solution; 

4. formally analyzed our solution by modeling the flow, assumptions and goals 
using a formal language (ASLan++) and model-checking the identified secu- 
rity goals with the SATMC tool. 


The results of our analysis show that our solution behaves as expected. 


Organization. Section 2 provides background on strong authentication solutions 
and SSO for native mobile apps, and on ASLan++ and SATMC. Section 3
describes the design of the proposed multi-factor authentication solution, dis- 
cusses the peculiarities of a multi-factor authentication solution compared to a 
basic username-password authentication, and identifies the corresponding secu- 
rity assumptions and security goals. For concreteness, Sect. 4 describes our solu- 
tion in the context of mHealth apps, and the solution is then formally analyzed 
using SATMC. Section 5 discusses related work and Sect. 6 draws conclusions. 


2 Background 


This section provides the basic notions required to understand the proposed 
design for a multi-factor authentication solution that supports a SSO experi- 
ence and its security assessment. In Sect. 2.1, we describe the entities involved 
in a multi-factor authentication and SSO solution, discuss the different OTP- 
generation approaches, and identify the functional requirements of a native SSO 
solution. In Sect. 2.2, we provide useful background for our formal analysis. 


2.1 Multi-factor Authentication and Native SSO 


The entities involved in a multi-factor native SSO solution are: a User (User) 
that wants to access a native Service Provider app (SPc); an Identity Provider 
server (IdPs) that manages the digital identities of the users and provides the
multi-factor process; a User Agent (UA), which could be a browser or a native
app used to perform the multi-factor process between the SPc and IdPs.
Optionally, the SPc app could have a backend server (SPs).

A multi-factor authentication solution augments the security of the basic 
username-password authentication by exploiting two or more authentication fac- 
tors. In [8], it is defined as: 


“a procedure based on the use of two or more of the following elements — 
categorised as knowledge, ownership and inherence: i) something only the 
user knows, e.g., static password, code, personal identification number; ii) 
something only the user possesses, e.g., token, smart card, mobile phone; 
iii) something the user is, e.g. biometric characteristic, such as a fingerprint.
In addition, the elements selected must be mutually independent [...]
at least one of the elements should be non-reusable and non-replicable”. 


The more factors are used during the authentication process, the more confidence 
a service has that the user is correctly identified. 

There are many multi-factor techniques on the market. In this paper, we 
focus on a well-accepted solution that combines a PIN code (“something only 
the user knows”) with the generation of an OTP using a software OTP generator 
(“something only the user possesses”). When an OTP-generation approach is 
used, a different password is generated for each authentication request and is 
valid only once, providing a fresh authentication property. Thus, compromising 
an old OTP does not have security consequences in the authentication process. 

There exist many algorithms for generating OTPs and we can classify them 
into three main OTP-generation approaches: 


— Time synchronization: the OTP is generated starting from a shared secret 
key (called seed) and the current time of the operation. IdPs must validate 
this value: only OTPs that fall into a short temporal range are accepted. 

— Lamport’s algorithm [9]: the first OTP is generated from a seed value and each 
successor OTP value is based on the value of its predecessor. For example, 
if s is a seed value and F(x) is a one-way function, we have the following 
OTPs: $o_1 = s$, $o_2 = F(o_1)$, $o_3 = F(o_2)$, ..., $o_n = F(o_{n-1})$. The last OTP, $o_n$,
is stored on IdPs. When a User wants to log in, she sends $o_{n-1}$ to the server,
and the server applies the function F and checks that the result corresponds
to the stored value. If the two values correspond, IdPs authenticates User
and updates the stored value with $o_{n-1}$. In the next login, User will use $o_{n-2}$
and so on. After n logins, User has to change the seed value and calculate
new OTP values.

— Challenge/Response: in the execution of this approach, IdPs presents a “chal- 
lenge” (e.g., a random number) and User answers with a valid “response”, 
which is an OTP value calculated using a mathematical algorithm starting 
from the challenge. 
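
As an illustration of the second approach, the following Python sketch builds a Lamport hash chain and shows the server-side check. It is a minimal sketch, not part of the solution analyzed in this paper: the choice of SHA-256 for the one-way function F and the names lamport_chain and LamportServer are assumptions made for this example.

```python
import hashlib
import hmac

def F(x):
    # One-way function; SHA-256 is an assumption of this sketch.
    return hashlib.sha256(x).digest()

def lamport_chain(seed, n):
    # Returns [o_1, ..., o_n] with o_1 = seed and o_{i+1} = F(o_i).
    chain = [seed]
    for _ in range(n - 1):
        chain.append(F(chain[-1]))
    return chain

class LamportServer:
    # Hypothetical server-side state: only the last accepted value is stored.
    def __init__(self, last_otp):
        self.stored = last_otp

    def login(self, otp):
        # Accept if F(otp) equals the stored value, then move one step down
        # the chain so that the same OTP cannot be replayed.
        if hmac.compare_digest(F(otp), self.stored):
            self.stored = otp
            return True
        return False
```

With a chain of length n, the user starts by sending the second-to-last element of the list and then walks backwards; a replayed OTP is rejected because the stored value has already moved on.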

Although our solution is parametric in the OTP-generation approach, in
Sect. 4, we will detail and analyze the time synchronization approach in the
context of a real-world scenario.
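
To make the time synchronization approach concrete, here is a hedged Python sketch of OTP generation and validation. It is written in the spirit of time-based OTP schemes but is not a conforming implementation of any standard, and it is not the scheme analyzed in Sect. 4. The 30-second step, the use of HMAC-SHA256, the simple truncation to six digits, and the function names are all assumptions of this example.

```python
import hashlib
import hmac
import struct

STEP_SECONDS = 30  # assumed length of one time step

def otp_at(seed, unix_time, digits=6):
    # Derive an OTP from the shared seed and the current time step.
    counter = int(unix_time) // STEP_SECONDS
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha256).digest()
    value = int.from_bytes(mac[:4], "big") % (10 ** digits)
    return str(value).zfill(digits)

def validate(seed, otp, now, window=1):
    # Server side: accept only OTPs that fall into a short temporal range,
    # here the current step plus or minus `window` steps.
    return any(
        hmac.compare_digest(otp, otp_at(seed, now + k * STEP_SECONDS))
        for k in range(-window, window + 1)
    )
```

The window parameter trades usability (tolerance to clock drift and typing delay) against the time during which a captured OTP would still be accepted.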

Native SSO protocols allow users to access multiple SPc apps through a single
authentication performed with an IdPs. As identified in [6], the two requirements
that we expect for a native SSO solution are: (i) the IdP user credentials
can be used to gain access to several SPc apps, which implies that a User does not
need to have credentials with an SPc to access it; (ii) if a User already has a login
session with an IdPs, then she can access new SPc apps without re-entering her
IdP credentials, as only the User consent is required.


2.2 Formal Analysis: ASLan++ and SATMC 


The use of formal languages and automatic tools for analyzing security protocols 
has allowed researchers to uncover a large number of vulnerabilities in protocols 
that had been thought to be, or even informally proved to be, secure. Famous 
examples range from protocols such as the Needham-Schroeder Public Key pro- 
tocol to Kerberos or TLS (see [10] for details). These examples underline how the 
design of a protocol that requires specific security goals is not a simple task, as its 
security depends on several assumptions on trust and communication channels 
(e.g., the federation between the involved parties, and the transport protocol 
used in the message exchange). Several formal languages have been developed, 
all sharing the idea to extract from the protocol message flow a description of the 
entities involved, the exchanged messages and the channel assumptions. Formal 
protocol specifications are then given in input to automated tools that check the 
desired security goals of the protocol against realistic threat models. 

In this paper, we use ASLan++ [11], the input specification language of 
the AVANTSSAR Platform [12]. ASLan++ is a high-level formal language that
formalizes the interactions between the different protocol roles, where a role 
represents a sequence of operations (e.g., sending and receiving messages) that 
must be executed by the entity that plays that role. ASLan++ supports the 
specification of different channel assumptions and security goals, most notably 
different variants of authentication and confidentiality. In our analysis, we use 
SATMC [13], which is one of the model checkers of the AVANTSSAR platform. 
SATMC uses state-of-the-art SAT Solvers and allows for the specification of 
security goals written using the Linear Temporal Logic. 


3 Description of Our mID(OTP) Solution 


In this section, we present a mobile identity management solution that augments 
the security of the native SSO solution proposed in [6] by adding a multi-factor 
authentication based on the generation of OTPs. We called it mID(OTP) to 
highlight the dual goal that our solution pursued: (i) to establish a multi-factor 
authentication and (ii) to manage identities for native mobile apps, e.g., provid- 
ing a SSO service. As we will describe, mID(OTP) is parametric on the OTP 
generation (i.e., it supports different OTP-generation approaches). 

In the mobile context, two possible design choices are available: a UA could be
played either by a browser (external or embedded in the SPc app) or by a native
app. In the design of mID(OTP), we have preferred the latter choice, as a native
app can be (easily) extended to support the generation of an authentication
factor (e.g., by adding the code for an OTP generator or a library to process the
user’s fingerprint). In addition, as the UA is involved in the authentication phase
with the IdPs, it must be trusted in knowing the user’s IdP credentials. Thus,
we assume that this native app, called IDOTP, is released directly by the IdPs.

mID(OTP) consists of three phases: registration, activation and exploitation, 
which we describe in the following subsections. 


3.1 Registration and Activation Phases of mID(OTP) 


The registration phase of mID(OTP) is performed by the SPc developers and
corresponds to the exchange of some information about SPc, such as the package
name and logo, together with its certificate fingerprint key_hash (i.e., the hash
of the certificate of the app). Note that key_hash depends on the private key
of the SPc developer and is thus different for apps by different developers. The
registration phase can be performed in different ways, e.g., entering the data into
an online dashboard or via an email exchange. As a trust relationship between
SPc and IdPs is established as a result of the registration phase, it is important
that the IdPs validates the SPc data and in some cases (e.g., when user personal
or sensitive data are involved) a service-level agreement could be required as well.
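
As a rough illustration of the data exchanged at registration, the following Python sketch derives a certificate fingerprint and records it in a registry similar to the one kept by IdPs. The hash algorithm (SHA-256), the Base64 encoding, the package name and the helper names `key_hash`, `register_sp` and `is_registered` are assumptions made for this example; the text only states that key_hash is the hash of the certificate of the app.

```python
import base64
import hashlib

def key_hash(cert_bytes: bytes) -> str:
    """Base64-encoded SHA-256 digest of the certificate bytes (assumed encoding)."""
    return base64.b64encode(hashlib.sha256(cert_bytes).digest()).decode("ascii")

# Hypothetical registry kept by the IdP after registration: package name -> key_hash.
trusted_sps = {}

def register_sp(package_name: str, cert_bytes: bytes) -> None:
    """Store the fingerprint supplied by the SP developer."""
    trusted_sps[package_name] = key_hash(cert_bytes)

def is_registered(package_name: str, presented_hash: str) -> bool:
    """True only if the presented fingerprint matches the registered one."""
    return trusted_sps.get(package_name) == presented_hash

# Placeholder bytes stand in for a real developer certificate.
register_sp("com.example.sp", b"developer-A-certificate")
```

Because the fingerprint depends on the developer's signing certificate, an app with the same package name but signed by a different developer does not match the registered entry.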

The activation phase of mID(OTP) is performed by the User to configure the 
native app IDOTP on her smartphone. In addition to the procedure described 
in [6]—user login and release of a token (token_IdP) used (from here on) to iden- 
tify the user session in place of the user credentials—at the end of the activation 
phase the IDOTP is configured to generate OTPs, usually requiring the creation 
of a PIN code for the future interactions. 

The activation phase can also be performed in different ways. As a multi-factor
authentication is configured during this phase, it is essential to provide
the User with an activation code—exchanged using a secure channel (e.g., after 
an in-person identification)—that she has to enter during the process. 


3.2 Exploitation Phase of mID(OTP) 


The exploitation phase of mID(OTP), which is shown in Fig. 1, is performed
every time the User accesses a SP that requires the multi-factor authentication
and SSO experience offered by IDOTP. In Step S1, User opens the SPc app
that sends a request to SPs including a session token token_sync (Step S2). SPs
checks the validity of token_sync. If token_sync has expired, SPs sends an error
message asking for a login to SPc (Step S3), otherwise Step S7 is executed. If
a login form is presented to User, she clicks the login button (Step A1) and
SPc sends a login request to IDOTP (Step A2). As a consequence, in Step A3
IDOTP reads the key_hash value of SPc and in Step A4 sends a request to
IdPs asking for the SPc data. The received key_hash is used by IdPs to validate
the SPc identity. If SPc is valid, IdPs returns to IDOTP a consent containing
the meta-data of SPc (Step A5). In Step A6, User checks whether SPc is the
app that she wants to access and decides whether to give her consent or not. If
User agrees, the OTP is generated following one of the approaches described in
Sect. 2.1 (Step A7). Then, in Step A8, IDOTP sends a token request to IdPs
including the OTP value, key_hash and token_IdP, which corresponds to the user
credentials entered during the activation phase. IdPs checks the validity of OTP,
key_hash and token_IdP. If they are valid, a token (token_SP) for the SP app is
returned (Step A9). token_SP contains the identity of User, IdPs and SP, and is
digitally signed with K^-1_IdPs, the private key of IdPs. In Step A10, IDOTP returns
token_SP to SPc as the result of Step A2. To finalize the authentication, SPc
sends a token request to SPs with token_SP (Step S4). SPs checks the validity
of token_SP, and if it is valid, creates and sends to SPc a token token_sync
(Step S5). This token will be used by SPc to synchronize user data in the
future interactions, until its expiration. When SPc needs to synchronize data, it
sends a request to SPs including token_sync (Step S6), and SPs returns the
requested resource to SPc (Step S7).


Fig. 1. Exploitation phase of mID(OTP).

We have labeled the steps with “S” and “A”. The S steps are related to the SP
(but note that our representation is only an example and each SP could support
different solutions). The A steps represent the steps related to the authentication
solution. As the S steps can vary depending on the choices of the SP developers,
in our analysis, we will focus on the A steps. Compared to the protocol flow
proposed in [6], we have enhanced its security by adding the generation, exchange
and validation of OTPs. The OTP extension protects mainly against
a stolen smartphone: even if the user’s smartphone is stolen, the intruder
cannot log in as the victim without generating the expected OTP.
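
The checks performed by IdPs in Steps A8 and A9 can be sketched in Python as follows. This is only an illustration under stated assumptions: an HMAC tag stands in for the digital signature with the private key of IdPs, and the names `issue_token_sp`, `verify_token_sp`, `users_db`, `trusted_sps`, `expected_otp` and the demo key are hypothetical.

```python
import hashlib
import hmac
import json

# Stand-in for the IdP private key (assumption: HMAC replaces a real signature).
IDP_KEY = b"demo-idp-key"

def issue_token_sp(request, users_db, trusted_sps, expected_otp):
    """Steps A8-A9: validate token_IdP, key_hash and OTP, then return token_SP or None."""
    user = users_db.get(request["token_idp"])      # session token from activation
    sp = trusted_sps.get(request["key_hash"])      # SP registered with this fingerprint
    if user is None or sp is None:
        return None
    if not hmac.compare_digest(request["otp"], expected_otp(user)):
        return None                                # wrong or missing OTP
    body = json.dumps({"user": user, "idp": "IdPs", "sp": sp}, sort_keys=True)
    tag = hmac.new(IDP_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}

def verify_token_sp(token):
    """Recompute the tag over the token body and compare it in constant time."""
    expected = hmac.new(IDP_KEY, token["body"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(token["tag"], expected)

users_db = {"tok-123": "alice"}
trusted_sps = {"hashA": "com.example.sp"}
otp_for = lambda user: "492039"                    # placeholder OTP oracle

token_sp = issue_token_sp(
    {"token_idp": "tok-123", "key_hash": "hashA", "otp": "492039"},
    users_db, trusted_sps, otp_for)
```

In the stolen-smartphone case, a request carrying a valid token_IdP and key_hash but no valid OTP is rejected, which is the protection the OTP extension is meant to add.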


3.3 Towards a Formal Specification of Multi-factor Authentication 


We now discuss the peculiarities of a multi-factor authentication solution com- 
pared to a basic username-password authentication; in doing so, we introduce 
some concepts that will be the key for the formal analysis. 

In a basic username-password authentication, the expected security goal is: 


(G1_A) SP authenticates User


Here, User is required to provide an authentication factor: either credentials 
(something only she knows) or a session token (e.g., a cookie stored in her 
browser) in order to properly complete the authentication process. If this is 
the case, it is possible to specify a minimum set of security assumptions (e.g.,
on the behavior of User or on the communication channels) that are necessary
to guarantee G1_A. For example, if the channel used for the login is not HTTPS,
then an intruder can eavesdrop on the User’s password and impersonate her in the
future. We call these assumptions strong assumptions (to distinguish them from
the weak assumptions that we define later). 

A multi-factor authentication solution augments the security of the basic 
username-password authentication by exploiting two or more authentication fac- 
tors. By the definition given in Sect. 2.1, we infer that mID(OTP) is a two-factor 
authentication solution using knowledge and ownership elements (factors). We 
do not consider inherence factors. In addition, instead of considering the inde- 
pendent factors, we introduce the concept of instance-factors. 

We call instance-factor (IFactor) every specific instance of either an ownership
factor (IFactor_o) or a knowledge factor (IFactor_k). The multi-factor authentication
solution mID(OTP) that we propose contains three instance-factors:


— the IFactor_o token_IdP that is stored in IDOTP and in IdPs as a result of
the activation phase (used as a session token in place of the user credentials
to provide a SSO experience);

— an IFactor_k that can vary according to the specific OTP generator used,
e.g., a PIN known by the user (used to protect the OTP generator);

— another IFactor_o that is stored in IDOTP (and possibly shared with IdPs), according
to the OTP-generator approach used (e.g., a seed value or a private key).


Note that the IFactor_o token_IdP is present in all instances of our solution,
whereas the other two factors may differ depending on the specific solution (and
this is the reason why we cannot name them explicitly a priori).

Compared to the classic notion of authentication factors, instance-factors can
have a dependency. For example, the two IFactor_o are stored in IDOTP. Thus,
by breaching the IDOTP app both of them are compromised. However, it is
important to note that different mitigations can be implemented for the different
instance-factors. For example, in our solution, if a User realizes that the
IDOTP has been compromised (e.g., if her smartphone has been stolen), she
can invalidate token_IdP, thus blocking possible attacks.

We are not aware of any formal definition of the multi-factor authentication
property apart from [14], which analyzes a two-factor and two-channel
authentication solution that combines a classic single-factor solution with the
exchange of a second factor using the GSM/3G/4G communication infrastructure
of the user’s mobile phone. By generalizing the definition in [14] to
a solution involving n instance-factors, we can define the following security
goal:


(G1_MFA) Goal G1_A (i.e., SP authenticates User) holds even if an intruder
knows up to n − 1 instance-factors.


Thus, the addition of instance-factors ensures some “redundancy”, meaning that 
even if one of them is compromised there are no attacks. 

We call weak assumption (wa) an assumption that, whenever it is not valid or
not implemented properly, causes the disclosure of a non-empty set of instance-factors
of the same type, i.e., either IFactor_o or IFactor_k. We refer to this
set as the set of instance-factors associated with wa and denote it by writing
IF(wa).¹ For example, if a weak assumption wa1 states that the intruder cannot
read the values typed by User, and in the authentication process User has
to enter her password and PIN, then IF(wa1) = {password, PIN}. This definition
can be easily extended to a set of weak assumptions WA’ as follows:
$IF(WA') = \bigcup_{wa_i \in WA'} IF(wa_i)$. We write WA to denote the set of all the weak
assumptions.


Defining Security Goals. The notions that we just introduced allow us to
rephrase the definition of the security goal G1_MFA of a multi-factor authentication
solution in the following way:


(G1_MFA) Goal G1_A holds under the strong assumptions and under chosen subsets
of weak assumptions (WA’) such that the set of instance-factors
associated to WA \ WA’ does not include all the instance-factors.
That is, |IF(WA \ WA’)| < n.
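
The set-based condition above can be made concrete with a small Python sketch. The mapping for wa1 follows the password and PIN example in the text; the second assumption wa2, the factor named token, and the function names are hypothetical and serve only to illustrate the definition.

```python
# IF(wa): instance-factors disclosed when the weak assumption wa does not hold.
IF = {
    "wa1": {"password", "PIN"},   # intruder cannot read the values typed by User
    "wa2": {"token"},             # hypothetical: intruder cannot read stored data
}
ALL_FACTORS = {"password", "PIN", "token"}   # n = 3 instance-factors

def instance_factors(assumptions):
    """IF(WA'): union of IF(wa) over the given set of weak assumptions."""
    disclosed = set()
    for wa in assumptions:
        disclosed |= IF[wa]
    return disclosed

def goal_expected_to_hold(valid):
    """True when the dropped assumptions (WA minus WA') disclose fewer than n factors."""
    dropped = set(IF) - set(valid)
    return len(instance_factors(dropped)) < len(ALL_FACTORS)
```

With this toy mapping, dropping either assumption alone leaves at least one instance-factor undisclosed, whereas dropping both discloses all three, so the goal is no longer expected to hold.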


¹ To compromise all instance-factors, at least two weak assumptions must be not valid.


A main characteristic of mID(OTP) is the use of OTPs. In G1_MFA, we considered
(among others) the instance-factors linked to the OTP generation. In
addition, as reported in Sect. 2, an OTP “should be non-reusable and non-replicable.”
Indeed, if the OTP is not fresh, then the knowledge of an OTP
leads to the same attacks possible when knowing the instance-factors linked to
its generation. Thus, it is crucial that the following security goal about the OTP
is satisfied:


(G2) The OTP must prove its origin (meaning that IdPs authenticates IDOTP,
as IDOTP is the only app that possesses a secret value shared with IdPs
or a private key), and it is non-reusable (i.e., IdPs accepts only one OTP
for a specific operation so as to avoid replay attacks).
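
The non-reusability part of G2 can be illustrated with a verifier that remembers which OTPs it has already accepted. The class `OtpVerifier` and the `expected_otp` callback are hypothetical names introduced for this sketch; how the expected value is computed depends on the OTP-generation approach in use.

```python
class OtpVerifier:
    """Accepts a given (user, OTP) pair at most once (illustrative helper)."""

    def __init__(self, expected_otp):
        self._expected_otp = expected_otp   # callable: user -> currently valid OTP
        self._used = set()                  # (user, otp) pairs already consumed

    def accept(self, user, otp):
        if otp != self._expected_otp(user):
            return False                    # OTP does not prove its origin
        if (user, otp) in self._used:
            return False                    # replay of an already consumed OTP
        self._used.add((user, otp))
        return True

verifier = OtpVerifier(lambda user: "123456")   # placeholder OTP oracle
```

A second submission of the same OTP for the same user is rejected, so an intruder who overhears a valid OTP cannot replay it.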


3.4 Assumptions 


Our solution is based on different security assumptions, which we have classified 
as strong or weak assumptions. 


Strong Assumptions. We have identified the following assumptions and 
checked them to be strong assumptions (see Sect. 4.5): Trust Assumption that
clarifies the trust relationships between the different entities, Communication 
Assumptions that specify the concrete implementation of the communication 
channels required in mID(OTP), and Activation Assumption that identifies the 
assumptions related to the activation phase of mID(OTP). 


Trust Assumption. mID(OTP) is based on the following trust relationship: 
(TA) IdPs is trusted by SPc. 


Communication Assumptions. Communications between the parties are subject 
to the following assumptions: 


(ComA1) The communication between SPc and IDOTP is carried over
an inter-app communication implemented using startActivityForResult().
This Android method, which allows an app to open
another app and get a result back, guarantees that the SPc app
that sends a request to IDOTP at Step A2 in Fig. 1 is the same app
that receives the result back from IDOTP at Step A10.

(ComA2) To read the key_hash value (Step A3 of Fig. 1), IDOTP
uses the Android method getPackageInfo(client_packageName,
PackageManager.GET_SIGNATURES), which extracts the information
about the certificate fingerprint included in the package of SPc.

(ComA3) The communication between IDOTP and IdPs occurs over a unilateral
SSL or TLS channel (henceforth SSL/TLS), established through
the exchange of a valid certificate (from IdPs to IDOTP).


Note that even if these assumptions refer to a concrete implementation of the
communication channels, in Sect. 4.3 we will provide the formal counterpart
abstracting away the implementation details. By doing so, any implementation
satisfying the abstract assumptions can be used in place of the implementation
mentioned above (e.g., considering a similar solution in the case of iOS), and
the results of our security analysis still hold. For example, the main reason to
have ComA1 is to avoid the eavesdropping of the identity assertion (token_SP)
by a malicious app, which an intruder could then use to impersonate the
user on another smartphone. An alternative implementation of ComA1 could be
obtained by requiring SPc to insert a fresh value in the token request. In this
way, SPc will accept only the token_SP that includes the expected fresh value.
Regardless of the design choice, it is crucial that SPc (and SPs if it is involved)
only accepts tokens that are released for itself for a particular operation.
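
The alternative implementation of ComA1 based on a fresh value can be sketched as follows. The class `SpClient`, the field names `sp` and `nonce`, and the use of a random hex string as the fresh value are assumptions made for this illustration, not part of the solution as specified.

```python
import secrets

class SpClient:
    """Hypothetical SPc that binds each token request to a fresh value."""

    def __init__(self, name="SPc"):
        self._name = name
        self._pending = None

    def new_token_request(self):
        """Create a request carrying a fresh, unpredictable value."""
        self._pending = secrets.token_hex(16)
        return {"sp": self._name, "nonce": self._pending}

    def accept(self, token_sp):
        """Accept only a token issued for this SP and for the pending request."""
        ok = (self._pending is not None
              and token_sp.get("sp") == self._name
              and token_sp.get("nonce") == self._pending)
        self._pending = None        # each fresh value is single-use
        return ok
```

A token released for another app, or for an earlier request, does not carry the expected fresh value and is therefore refused.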


Activation Assumption. Phishing attacks (e.g., a malicious app that creates a 
fake login form and steals the user’s credentials) are one of the most common 
types of attack and usually are beyond the scope of an authentication protocol. In 
our analysis, together with a secure communication, we assume that no phishing 
is possible during the activation phase: 


(ActivA) The activation phase is correctly performed by User. That is, User 
downloads the correct IDOTP (it is not a fake app) and correctly 
follows the process, and the communication channels used are secure. 


Weak Assumptions. We have identified two categories for weak assump- 
tions: Background Assumptions that specify the assumptions on the environ- 
ment (user’s smartphone), and User Behavior Assumptions that specify which 
user behaviors are allowed in our model. 


Background Assumptions. The environment is subject to these assumptions: 


(BA1) Integrity and confidentiality of data stored in the device. 
(BA2) There is no surveillance software (e.g., keylogger) installed on the user’s 
device capable of reading the values that User types. 


User Behavior Assumptions. To enforce a correct execution of the flow and to 
investigate the security consequences of a stolen smartphone, in our analysis we 
take into account the following behavioral rules: 


(UBA1) User enters her IFactor_k only in the correct IDOTP app being careful
not to be seen by other people.

(UBA2) User is the only person using the IDOTP app that stores the IFactor_o
associated to her identity.


4 Formal Specification and Analysis of the mID(OTP) 
Solution: The mHealth Use-Case 


In this section, we describe how the semi-formal description of the mID(OTP)
solution can be translated into a formal model (in this case, specified in
ASLan++). mID(OTP) provides a general solution for several application contexts.
Instead of first presenting the general model and then the formalization
of a use-case, for brevity and concreteness, here we describe directly the formalization
of a real use-case scenario that involves mHealth (mobile health)
apps. All the concepts presented apply in general to every solution based on
mID(OTP) (apart from a trivial renaming of the entities). Only the steps and
instance-factors related to the particular OTP generator used are specific for
this use-case.

In Sect. 4.1, we describe the entities and the steps of the OTP-generator app- 
roach for this use-case. In Sects. 4.2 and 4.3, we detail the mapping between the 
assumptions and their formal specification. In Sect. 4.4, we give the formalization 
of the security goals. In Sect. 4.5, we present the results of our security analysis. 


4.1 Description of the TreC Scenario 


TreC is an acronym for “Cartella Clinica del Cittadino”, i.e., “Citizens’ Clinical 
Record”. TreC is a platform developed in the Trentino region (Italy) for managing
personal health records (PHRs).² In addition to the web platform, which
is routinely used by around 80,000 users, TreC is currently designing and implementing
a number of native Android applications to support self-management
and remote monitoring of chronic conditions. These applications are used in a
“living lab” by voluntary chronic patients according to their hospital physicians.
Examples are: 


— “TreC-Lab: Diario Diabete”, a mobile diary that allows patients to record 
health data, such as the blood glucose level and physical activity, and 

— “TreC: Referti”, which permits patients to consult their personal health data 
and medical prescriptions from the smartphone. 


In the traditional web scenario, patients access services using their local health- 
care system credentials (leveraging a SAML-based SSO [15] solution), but a 
solution for native SSO was missing. The solution we have proposed will allow 
patients to access different TreC e-health native mobile apps (and possibly other 
third-party e-health apps) through a single authentication act. An implementa- 
tion of the proposed model is currently being tested by TreC users. 

In the following, we instantiate the entities described in Sect. 3 with the
entities involved in TreC: Patient plays the role of User who wants to access
her PHR on her smartphone. ADC (“Autenticazione del Cittadino”) is the IdP
of the local health care system and plays the role of IdPs. OTP-PAT plays the
role of IDOTP and manages the generation of OTPs and the SSO experience
for the apps installed on the phone that are part of the federation. TreCc (TreC
client) plays the role of SPc; it is one of the apps that are part of the ADC
federation and is used by Patient to read her PHR. TreCs (TreC server) plays
the role of SPs and manages user health data.

² More information is available at https://trec.trentinosalute.net/.


Fig. 2. MSC of the exploitation phase of the TreC scenario.


Figure 2 shows the A-steps of the exploitation phase of mID(OTP) for this
use-case. Compared to Fig. 1, we have detailed the OTP generation box (Steps
A7.a-c), and graphically shown the channel properties, which we will explain
in Sect. 4.3. Given that TreCs is not involved in the A-steps, for the sake of
brevity, in the rest of the section we refer to TreCc simply as TreC. Steps A7.a-c
model the behavior of a Time-OTP (TOTP) algorithm [16], which is a
time synchronization algorithm that generates OTPs as a function of the time
of the execution and a seed (i.e., a shared secret). In general, the TOTP algorithm
requires that “the prover and verifier must either share the same secret
or the knowledge of a secret transformation to generate a shared secret” [16],
without specifying when and how to exchange this secret. In the analyzed use-case,
OTP-PAT obtains the seed value as part of the activation phase, and then
stores it encrypted with the PIN code ({|seed|}_PIN) selected by Patient. Thus,
the OTP generation box depicted in Fig. 1 is replaced here with a PIN request
(Step A7.a), the entering of the PIN (Step A7.b) and the generation of the
OTP as a function of time and of the seed, which is extracted using the PIN as
decryption key (Step A7.c).
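
To make the OTP generation step more concrete, the following Python sketch computes a time-based OTP in the style of RFC 6238 (HMAC-SHA1 over the time-step counter, followed by dynamic truncation). It is a minimal illustration rather than the implementation used in OTP-PAT: the 30-second time step and 6-digit length are assumed defaults, the function name `totp` is hypothetical, and the decryption of {|seed|}_PIN with the PIN is omitted because the cipher is not specified in the text.

```python
import hashlib
import hmac
import struct

def totp(seed: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based OTP: HMAC-SHA1 over the time-step counter, dynamically truncated."""
    counter = unix_time // step                       # number of elapsed time steps
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                           # low nibble selects a 4-byte window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)     # keep the last `digits` digits
```

Both the prover and the verifier evaluate the same function on the shared seed and on their local clocks, so the verifier can recompute the expected value without any OTP being stored in advance.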

The TreC scenario corresponds to a multi-factor authentication with 3
instance-factors: token_IdP and {|seed|}_PIN are IFactor_o, and PIN is an
IFactor_k.

In the rest of this section, we present the formalism that we have used to 
specify this use-case, detailing the initial state and the behavior of the entities, 
the channels and the security goals. We also describe how we have formalized 
the assumptions presented in Sect. 3.4. In Table 1, we show each assumption and 
the corresponding formal specification. In addition, we model what in Sect. 3.3 is 
indicated as an assumption not valid or not implemented properly by removing 
it from the formal model, as shown in the last column of Table 1.


4.2 Formal Specification of the Initial State and of the Behavior of Entities


Initial States. The initial state of a protocol defines the initial knowledge of 
the intruder, who is indicated with the letter i, and of all the honest entities 
that participate in the protocol session, where a protocol session is a particular 
run of the protocol, played by specific entities, using specific instances of the 
communication channels and optionally, additional parameters that must be 
passed as initial knowledge to the different entities. To model the TA assumption,
as shown in Table 1, in our analysis we have not considered sessions with i
playing the role of OTP-PAT and ADC.

Regarding the registration phase, we have modeled the data provided by the
TreC developer as initial knowledge of ADC. In general, after the registration
phase, IdPs creates two databases: trustedSPs, containing the relation between
the SPc identities and their key_hash values, and metadataDB, containing the
relation between the key_hash and the information (e.g., name and logo) provided
by the SP developers. As shown in Table 1 by the ActivA assumption,
we have modeled the data obtained as result of the activation phase (token_IdP
and data required for generating OTPs) as initial knowledge of User, IDOTP
and IdPs. In particular, for the use-case, as result of the activation phase:
Patient knows her PIN value (pinUser), OTP-PAT knows token_IDP and
{|seed|}_pinUser, and ADC creates a DB (usersDB) with Patient, token_IDP
and seed as entry.


Table 1. Mapping between assumptions (Asm(s) for short) and formal specification.

Asm | Formal specification: Specification of Asm | Formal specification: Removal of Asm
TA | We do not consider sessions with i playing the role of ADC | add sessions with i playing the role of ADC
ComA1 | link(T2O,O2T); | delete link(T2O,O2T);
ComA2 | authentic_on(T2O,TreC); and DB Keyhash | delete authentic_on(T2O,TreC);
ComA3 | confidential_to(O2A,ADC); weakly_authentic(O2A); weakly_confidential(A2O); authentic_on(A2O,ADC); link(O2A,A2O); | delete confidential_to(O2A,ADC); weakly_authentic(O2A); weakly_confidential(A2O); authentic_on(A2O,ADC); link(O2A,A2O);
ActivA | Data obtained during the activation phase are nonpublic values shared as parameters between Patient, OTP-PAT and ADC | add iknows(pinUser); iknows(token_IDP); iknows({|seed|}_pinUser); in general add all the iknows(IFactor); obtained during the activation phase
BA1 | “Built-in”: i cannot read the internal state of the other entities | add iknows(token_IDP); and iknows({|seed|}_pinUser); in general add all the iknows(IFactor_o);
BA2 | “Built-in”: i cannot read the internal state of the other entities | add iknows(pinUser); in general add all the iknows(IFactor_k);
UBA1 | confidential_to(P2O,OTP-PAT); | delete confidential_to(P2O,OTP-PAT);
UBA2 | authentic_on(P2O,Patient); | delete authentic_on(P2O,Patient);

To specify that the intruder knows a message m, we use the ASLan++ pred- 
icate iknows(m). As shown in Table 1 for ActivA, BA1 and BA2, the removal of 
an assumption (which we will do to consider different scenarios of the analysis) 
boils down to adding some iknows facts to the initial knowledge of the intruder. 


Behavior of Entities. The behavior of the honest entities is specified by the 
evolution of the system, which consists of a sequence of operations performed 
by each role. For simplicity, Fig. 3 shows the evolution of the protocol using a
process view, which describes the messages exchanged in Fig. 2 for each entity 
as a set of actions (e.g., receive or send a message and DB access). This formal 
representation can be translated into various role-based formal languages and 
input to different state-of-the-art security protocol analyzers. In our analysis, we 
use ASLan++ and SATMC (see [11] for more details on language and tool). 

The translation of the process view into ASLan++ is quite straightforward.
The complete ASLan++ specification can be found at
https://st.fbk.eu/publications/POST-2018. Here, for lack of space, we provide only an example
by considering Steps 1 and 2 of Fig. 2, which involve the entities Patient, TreC
and OTP-PAT. Focusing on TreC, this exchange of messages in ASLan++ corresponds
to


Patient -Ch_P2T-> Actor: Request; % Step 1
Actor -Ch_T2O-> OTP-PAT: Actor; % Step 2


where Actor is the keyword used in ASLan++ to represent the entity taken into
consideration, in our example TreC.

In our analysis, we have considered the behavior of a Dolev-Yao intruder [17],
who can overhear and modify messages using his initial knowledge and the knowledge
obtained from the traffic; this behavior is built into the SATMC tool. An
operation that is not allowed to i is the reading of the internal state of another
entity, where an internal state is a list of expressions known by the corresponding
entity. Thus, as highlighted in Table 1, BA1 and BA2 are built into the tool.


4.3 Formal Specification of Channels 


For a detailed definition of the properties of channels between two protocol 
entities A and B we point the reader to [18,19]. In a nutshell, consider a message 
M sent on a channel A2B from A to B. A2B is authentic if B can rely on the fact 
that only A could have sent M. A2B is confidential if A can rely on the fact that 
only B can receive M. A2B is weakly authentic if the channel input is exclusively 
accessible to a single, but yet unknown, sender, and A2B is weakly confidential if 
the channel output is exclusively accessible to a single, yet unknown, receiver. A 
link between two channels A2B and B2A means that the entity sending messages
over A2B is the same entity that receives messages from B2A. We have
represented these properties graphically in Fig. 2 as follows: A •→ B, A ◦→ B,
A →• B, A →◦ B mean authentic, weakly authentic, confidential and weakly
confidential channel, respectively; moreover, we indicate a link property between
two channels with the same trace for the corresponding arrows.


[Figure: process view of the protocol, given as one sequence of actions per entity.]

Patient: P2T!Request; O2P?PINRequest; P2O!PIN

TreC: P2T?Request; T2O!Actor; O2T?{ADC.Patient.Actor}_inv(pk(ADC))

OTP-PAT: T2O?TreC; check(TreC,Keyhash) in infoMobile; O2A!Keyhash;
A2O?Metadata; userconsent; O2P!PINRequest; P2O?PIN; check(PIN) in PIN_IDOTP;
O2A!otp_generation(Seed,Time).TreC.Keyhash.Token_IDP;
A2O?{ADC.Patient.TreC}_inv(pk(ADC)); O2T!{ADC.Patient.TreC}_inv(pk(ADC))

ADC: O2A?Keyhash; check(Keyhash,Metadata) in metadataDB; A2O!Metadata;
O2A?otp_generation(Seed,Time).TreC.Keyhash.Token_IDP;
check(TreC,Keyhash) in trustedSPs; check(Patient,Seed,Token_IDP) in usersDB;
A2O!{ADC.Patient.TreC}_inv(pk(ADC))

Legend:

• P, T, O, and A stand for Patient, TreC, OTP-PAT, and ADC respectively, and P2T, P2O, O2P,
O2T, O2A, T2O, A2O are their unidirectional channels.

• Ch!M means that message M is sent over channel Ch.

• Ch?M means that a message, say M, is received over channel Ch and a variable X is set to M.

• M1.M2 is the concatenation of messages M1 and M2.

• check(X,Y,...,Z) in DB means that (X,Y,...,Z) must be in DB, otherwise the protocol stops.

• {M}_inv(pk(ADC)) means that message M is digitally signed with the private key of ADC.


Fig. 3. Protocol view.

As shown in Table 1, we have modeled as channel properties the three communication
assumptions (ComA1, ComA2 and ComA3) and the two user behavior
assumptions (UBA1 and UBA2). The modeling of these assumptions is far from
a trivial mapping and requires an explanation.

ComA1 is related to the inter-app communication in the mobile. The property
provided by the startActivityForResult method can be modeled by a link
property between the two channels used in the mobile: the app that has sent a
request is the same app that will receive the result.

ComA3 is modeled with five channel properties (see Table 1) that all together 
model a TLS/SSL unilateral channel. 

Regarding ComA2, we have modeled an Android method, which extracts the 
key_hash value included in the package of an app, using an authentic channel 
(used by TreC to send its identity to OTP-PAT) and a DB containing the 
relations between the SPc identities and their key_hash, used by OTP-PAT 
to read the correct key_hash value. This is due to the fact that this method— 
executed by the Android OS—guarantees the authenticity of its output. 

We have modeled UBA1 and UBA2 as properties of the channel from Patient
to OTP-PAT (P2O). UBA1 is necessary to prevent leakage of the PIN (entered
in a malicious app or watched by an intruder during the typing); thus, we have
modeled P2O as a confidential channel. UBA2 guarantees the possession of the
OTP-PAT app installed in the user’s smartphone. Under this assumption, only
the valid Patient can communicate with that particular installation of OTP-PAT;
thus, we have modeled P2O as an authentic channel.


4.4 Formal Specification of Security Goals 


As described in Sect. 3.3, we have defined G1_MFA in terms of a traditional authentication
goal and the strong and weak assumptions. This means that, in the
formal model, we consider the traditional authentication goal G1_A and we check
whether it holds under the strong assumptions and different (sub)sets of weak
assumptions. The property must hold if the intruder is not able to compromise all
the instance-factors. G1_A requires that a message is transmitted in an authenticated
and fresh manner, thus allowing TreC to authenticate Patient and offering
replay-protection at the same time. For the definition of authentication we refer
to [20]: whenever the entity B completes a run of the protocol apparently with
the entity A, then A has previously been running the protocol apparently with
B, and the two entities agree on a message M. In ASLan++, this corresponds
to specifying the goal


(G1_A) SP_authn_U_on_Request:(_) Patient *->> TreC;


where *->> indicates authenticity, directedness (i.e., the only (honest) receiver
of a message is the intended one [11]) and freshness. In addition, following the
definition in [20], associated goal labels are used to specify which values of M the
goal is referring to, namely, the Request value in State 1 of the Patient process
(in Fig. 3) and the corresponding value in the last state of the TreC process
(State 3 in Fig. 3).

Similarly, the OTP properties are checked by means of the goal


(G2) IDP_authn_UA_on_OTP:(_) OTP-PAT *->> ADC;


with the associated goal labels specifying for M the values
otp_generation(Seed, Time) in State 3 of both the OTP-PAT and the ADC processes in Fig. 3,
where we have modeled Seed as a constant value shared between OTP-PAT and
ADC, and Time as a session parameter (cf. [16]) shared between OTP-PAT and
ADC. Thus, ADC will accept only one OTP value for each session, enforcing
the property (informally described in Sect. 3.3) that OTP is non-reusable.


4.5 Results of the Security Analysis 


We are now ready to discuss the results of the security assessment that we 
have performed on the mHealth use-case. Our focus is determining whether the 
concurrent execution of a finite number of protocol sessions enjoys the expected 
security goals in spite of the intruder. To this aim, we have mechanically analyzed 
the formal model of our use-case using SATMC, a state-of-the-art model checker 
for security protocols. SATMC carries out an iterative deepening strategy on 
k. Initially k is set to 0, and then it is incremented until an attack is found 
(if any) or k_max is reached. If k_max is reached, no attack traces of length up 
to k_max exist, where the length of the trace is computed by taking into account 
the parallel execution of non-conflicting actions (actions executed in parallel 
are considered as a single step). The trace includes the actions performed by 
the attacker and honest participants, where most of the actions of the attacker 
are executed in parallel (and counted as a single step) with the ones of honest 
participants. We set k_max to 1.5 times the length of the longest trace of the 
protocol when only honest entities participate. As a rule of thumb, with this 
choice we are reasonably confident that no attack is possible with greater 
values of k_max. In our analysis, the length of the longest trace of the 
protocol when only honest entities participate is 19, and thus we have set 
k_max = 30. We have considered several scenarios including (at most) three 
parallel sessions in which the intruder either does not play any role or plays 
the role of SPc (the TreC app in the use-case). In each session, we used 
different instances of the channels. The complete 
set of specifications can be found at the companion website. 
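
The iterative deepening loop described above can be sketched as follows. This 
is only an illustration and not SATMC's implementation: find_attack is a 
hypothetical stand-in for one bounded model-checking run at depth k, and 
toy_oracle is invented purely to exercise the loop. Note that 1.5 x 19 = 28.5, 
so the bound of 30 used in the analysis is slightly above what the rule of 
thumb requires. 

```python
import math
from typing import Callable, Optional, Sequence, Tuple

Trace = Sequence[str]


def rule_of_thumb_bound(longest_honest_trace: int, factor: float = 1.5) -> int:
    """Smallest integer bound satisfying the 1.5x rule of thumb."""
    return math.ceil(factor * longest_honest_trace)


def iterative_deepening(
    find_attack: Callable[[int], Optional[Trace]], k_max: int
) -> Optional[Tuple[int, Trace]]:
    """Try k = 0, 1, ..., k_max and stop at the first depth with an attack.

    `find_attack(k)` is a hypothetical oracle: it returns an attack trace of
    length k, or None when no such trace exists.
    """
    for k in range(k_max + 1):
        trace = find_attack(k)
        if trace is not None:
            return k, trace
    return None  # no attack trace of length up to k_max


def toy_oracle(k: int) -> Optional[Trace]:
    """Invented oracle: pretends the shortest attack has 7 steps."""
    return ["step"] * k if k >= 7 else None
```

With the toy oracle the loop stops at depth 7; with an oracle that never finds 
an attack it exhausts all depths up to k_max and reports nothing. 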

In Sect. 3.4, in relation to the security goal G1_MFA (and consequently to 
G1_A), we have described a list of strong and weak assumptions that we have 
added to the model to constrain the intruder’s abilities. Table 2 summarizes the 
security analyses that we have performed to check this goal. 


Table 2. Analyses performed for G1_A. 


Analysis   Strong Asm(s)   Weak Asm(s)             Atk 
1          all - 1         all                     Yes 
2          all             all - 1                 No 
3          all             all - m (1 < m < 4)     * 

Regarding the strong assumptions (TA, ComA1, ComA2, ComA3 and 
ActivA), we have performed the following analyses: 


Analysis 1: We have checked that by removing only one of the five strong 
assumptions from the model we have a violation of G1_A (i.e., there 
is an attack). For this analysis, we have thus performed 5 executions 
of SATMC removing one strong assumption at a time. To 
provide an example of an attack, Fig. 4 shows the attack trace 
deriving from removing ComA2. In this attack, i can impersonate 
trec simply because the channel used to exchange its identity is 
not authentic; thus, i can pretend to be another app. Note that, 
for the sake of clarity, this figure (and, similarly, the other figures 
shown in this section) represents only the significant steps of the 
attack traces found by the SATMC tool.³ 


Regarding the weak assumptions (BA1, BA2, UBA1, and UBA2), we have 
performed the following analyses that are detailed in Table 3: 


Analysis 2: We have checked that by removing only one of the four weak 
assumptions from the model, SATMC does not find any attack 
on the solution (i.e., the intruder is not able to impersonate the 
user). Indeed, as shown in Table 3, by removing only one weak 
assumption, the intruder obtains only 1 or 2 instance-factors. 


Analysis 3: We have checked that by removing specific subsets of weak 
assumptions it is possible to compromise all the instance-factors, causing 
a violation of G1_A. In Table 2, the star (*) denotes that the result 
can be “yes” or “no” depending on the chosen subset of weak 
assumptions. The subsets shown in Table 3 violate G1_A and result 
in different attack traces. Figure 5 shows the attack trace deriving 
from removing UBA1 and UBA2 (e.g., a proximity intruder that 
watches the PIN entered by Patient and then steals the 
smartphone). In the attack, i initiates a session of the protocol with 
trec pretending to be patient (indicated as i(patient)). By 
entering the PIN code (pinUser) when requested by otppat, i 
is able to impersonate the patient and obtain the requested 
resource (resources1). Figure 6 shows the attack trace deriving 
from removing both BA1 and BA2 (e.g., a hacker that steals the 
PIN typed by Patient using a keylogger and reads token_IdP and 
{|seed|}_PIN by exploiting malware installed on the smartphone). 
In this case, i is able to generate an OTP and send a token request 
to adc. 


Table 3. Results for G1_A (Analyses 2 and 3). 


Removed weak Asm(s)      Compromised factors                   Atk 
                         PIN   {|seed|}_PIN   token_IdP 
BA1                      ✗     ✓              ✓                No 
BA2                      ✓     ✗              ✗                No 
UBA1                     ✓     ✗              ✗                No 
UBA2                     ✗     ✓              ✓                No 
(UBA1 ∨ BA2) ∧ BA1       ✓     ✓              ✓                Yes 
(UBA1 ∨ BA2) ∧ UBA2      ✓     ✓              ✓                Yes 


[Figure 4: message sequence chart whose participants include :patient, 
i(trec) and :adc. Messages shown: keyhash, metadata, pinRequest, pinUser, 
otp_generation(seedotp, n(CTime_1)).trec.keyhash.token_idp, 
{adc.patient.trec}_inv(pk(adc)).] 


Fig. 4. Attack trace without the strong assumption ComA2. 


[Figure 5: message sequence chart among i(patient), :trec, :patient, :otppat 
and :adc. Messages shown: request1, keyhash, metadata, pinUser, 
otp_generation(seedotp, n(CTime_1)).trec.keyhash.token_idp, 
{adc.patient.trec}_inv(pk(adc)).] 


Fig. 5. Attack trace obtained removing UBA1 and UBA2. 


[Figure 6: message sequence chart. Messages shown: request1, 
otp_generation(seedotp, n(CTime_1)).trec.keyhash.token_idp, 
{adc.patient.trec}_inv(pk(adc)).] 


Fig. 6. Attack trace obtained removing BA1 and BA2. 


³ The original charts can be examined on the companion website 
https://st.fbk.eu/publications/POST-2018. 
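
The reasoning behind Analyses 2 and 3 can be replayed with a small sketch. 
The mapping below from each removed weak assumption to the instance-factors it 
exposes is an assumption read off Table 3 and the two attack scenarios in the 
text (a keylogger or shoulder-surfer yields the PIN; malware or a stolen phone 
yields the encrypted seed and the token); the names enc_seed, COMPROMISED_BY 
and the helper functions are illustrative only. 

```python
from itertools import combinations

# The three instance-factors; "enc_seed" stands for {|seed|}_PIN.
FACTORS = frozenset({"PIN", "enc_seed", "token_IdP"})

# Assumed reading of Table 3: factors exposed when one assumption is removed.
COMPROMISED_BY = {
    "BA1": {"enc_seed", "token_IdP"},
    "BA2": {"PIN"},
    "UBA1": {"PIN"},
    "UBA2": {"enc_seed", "token_IdP"},
}


def compromised(removed):
    """Union of the factors exposed by all removed weak assumptions."""
    exposed = set()
    for assumption in removed:
        exposed |= COMPROMISED_BY[assumption]
    return exposed


def attack_possible(removed):
    """An attack on the goal needs every instance-factor to be compromised."""
    return compromised(removed) == FACTORS


def minimal_attacking_subsets():
    """Smallest sets of removed assumptions that enable an attack."""
    found = []
    names = sorted(COMPROMISED_BY)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            candidate = frozenset(subset)
            if attack_possible(candidate) and not any(m <= candidate for m in found):
                found.append(candidate)
    return found
```

Under this reading no single removal is enough, and the minimal attacking 
subsets are exactly the pairs covered by (UBA1 ∨ BA2) ∧ BA1 and 
(UBA1 ∨ BA2) ∧ UBA2. 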


As expected, when checking the solution w.r.t. the security goal G2—which 
embodies the OTP properties—under all the (weak and strong) assumptions, 
SATMC does not find any attack. 


5 Related Work 


OAuth 2.0 [21] and OpenID Connect [22] have been designed for light-RESTful 
API services, and are considered the de-facto standards for managing authenti- 
cation and authorization. These protocols are well-accepted in the web scenario, 
but they provide only partial support for mobile apps (frequent use of the expres- 
sion “out of scope”). This could lead to the implementation of insecure solutions. 
An in-depth analysis of OAuth in the mobile environment—underlining possible 
security problems and vulnerabilities—is available in [23,24]. 

Given the lack of specifications, the OAuth Working Group has released in 
2017 a best practice with the title “OAuth 2.0 for Native Apps” [25]. The spec- 
ification of [25] has two main differences with respect to our solution: the choice 
of UA (browser vs native app) and the activation phase. The authors of [25] do 
not describe any security issues in using native apps as UA; they discourage 
this because of the overhead on users to install a dedicated app. Nevertheless, 
in some scenarios, we consider this to be an advantage rather than a drawback 
because it allows for easily integrating new security mechanisms (e.g., access 
control and a wider range of MFA solutions). Concerning the activation phase of 
our solution, it allows for better mitigation of phishing as users directly interact 
with our app. Instead, [25] requires a redirection from a (possibly malicious) 
SPc to a browser, thus users can be cheated by a fake browser invoked by SPc. 
We want to underline that, as described in [7], our solution is not designed from 
scratch but on top of Facebook; and the formalization that we have presented 
in this work can be easily extended to also analyze the OAuth solution of [25]. 

Much research has been carried out to discover vulnerabilities in different 
implementations of OAuth 2.0 and OpenID Connect in web and mobile scenar- 
ios. For instance, Sun et al. [26] analyzed hundreds of OAuth apps focusing on 
classical web attacks such as Cross-Site Scripting (XSS) and Cross-Site Request 
Forgery (CSRF). Other studies, such as [27,28], analyzed the implementations 
of multi-party web apps via browser-related messages. In the context of mobile 
apps, a similar work is described in [29], where Yang et al. discovered an incor- 
rect use of OAuth that permits an intruder to login as a victim without the 
victim’s awareness. To evaluate the impact of this attack, they have shown that 
more than 40% of 600 top-ranked apps were vulnerable. 

Although these techniques are useful for the analysis of a specific implemen- 
tation (as they are able to discover serious security flaws), it is important to 
perform a comprehensive security analysis of the standard itself. In the context 
of web apps, Fett et al. [30] performed a formal analysis of the OAuth protocol 
using an expressive web model (defined in [31]) that describes the interaction 
between browsers and servers in a real-world set-up. This formal analysis revealed 
two unknown attacks on OAuth that violate the authorization and authentica- 
tion properties. A similar analysis is performed for OpenID Connect in [32]. Two 
other examples of formalizations of OAuth are [33], where the different OAuth 
flows are modeled in the Applied Pi calculus and verified using ProVerif extended 
with WebSpi (a library that models web users, apps and intruders), and [34], 
where OAuth is modeled in Alloy. 

In our analysis (cf. Sect.4) we used ASLan++ and SATMC. In the past, 
SATMC has revealed severe security flaws in the SAML 2.0 protocol [15] and in 
the variant implemented by Google [18]; by exploiting these flaws a dishonest 
service provider could impersonate a user at another service provider. Moreover, 
Yan et al. [35] used ASLan++ and SATMC to analyze four security properties 
of OAuth: confidentiality, authentication, authorization, and consistency. 

The aforementioned formal analyses, however, focus on the web app scenario, 
whereas in this paper we deal with native apps. In [36], Ye et al. used ProVerif 
to analyze the security of an SSO implementation for Android. They applied 
their approach to the implementation of the Facebook Login and identified a 
vulnerability that exploits super user (SU) permissions. In contrast, our analysis 
assumes that the user's smartphone cannot be rooted. Indeed, if a malicious app 
is able to obtain a SU permission, then it can set for itself the permission to 
access all the data stored in the smartphone, compromising all the user data 
and the tokens of the other apps installed on the rooted smartphone. 

Yubikey NEO [37] is one of the most attractive mobile identity management 
products on the market. It is a token device that supports OTPs and the FIDO 
Alliance Universal 2nd Factor (U2F) protocol, and, by integrating an NFC (Near 
Field Communication) technology, it can be used to provide a second-factor 
also in the mobile context. Compared to this product, our solution provides a 
multi-factor authentication solution for native mobile apps without requiring an 
additional device. 


6 Conclusions 


We have presented the design of mID(OTP), a multi-factor authentication 
solution for native mobile apps that includes an OTP exchange and provides an 
SSO experience. In addition to the protocol flow, we have detailed the security 
assumptions and defined two security goals: G1_MFA, related to a multi-factor 
authentication solution, and G2, which identifies the properties of an OTP. To 
perform a security analysis of mID(OTP), we have detailed the OTP-generation 
approach in the context of a real use-case scenario (TreC). We have formally 
modeled the flow, assumptions and goals of TreC using a formal language 
(ASLan++) and checked the identified security goals using a model-checker 
(SATMC). 

The solution we have presented, as well as the formal specification and anal- 
ysis that we have given, can be generalized quite straightforwardly to other 
use-cases, which we are currently doing. As future work, we also plan to extend 
the analysis to other authentication factors, such as biometric traits. In addition, 
we started exploring an alternative formalization of multi-factor authentication 
protocols that decomposes the protocol and models the authentication property 
as a composition of two goals: one related to basic authentication (involving 
User, UA, SPc and IdPs) and one related only to the generation and valida- 
tion of the OTP (without involving SPc). In this way, a proper separation is 
kept between the multi-factor authentication performed with IdPs and the basic 
authentication plus SSO experience offered to SPc. As a preliminary analysis, 
we can affirm that the two different definitions of goals lead to similar attack 
traces. 
Abstract. Albeit the primary usage of Bitcoin is to exchange currency, 
its blockchain and consensus mechanism can also be exploited to securely 
execute some forms of smart contracts. These are agreements among 
mutually distrusting parties, which can be automatically enforced with- 
out resorting to a trusted intermediary. Over the last few years a variety 
of smart contracts for Bitcoin have been proposed, both by the aca- 
demic community and by that of developers. However, the heterogeneity 
in their treatment, the informal (often incomplete or imprecise) descrip- 
tions, and the use of poorly documented Bitcoin features, pose obstacles 
to the research. In this paper we present a comprehensive survey of smart 
contracts on Bitcoin, in a uniform framework. Our treatment is based 
on a new formal specification language for smart contracts, which also 
helps us to highlight some subtleties in existing informal descriptions, 
making a step towards automatic verification. We discuss some obstacles 
to the diffusion of smart contracts on Bitcoin, and we identify the most 
promising open research challenges. 


1 Introduction 


The term “smart contract” was conceived in [43] to describe agreements between 
two or more parties, that can be automatically enforced without a trusted 
intermediary. Fallen into oblivion for several years, the idea of smart contract 
has been resurrected with the recent surge of distributed ledger technologies, 
led by Ethereum (http://www.ethereum.org/) and Hyperledger (https://www. 
hyperledger.org/). In such incarnations, smart contracts are rendered as com- 
puter programs. Users can request the execution of contracts by sending suitable 
transactions to the nodes of a peer-to-peer network. These nodes collectively 
maintain the history of all transactions in a public, append-only data structure, 
called blockchain. The sequence of transactions on the blockchain determines the 
state of each contract, and, accordingly, the assets of each user. 

A crucial feature of smart contracts is that their correct execution does not 
rely on a trusted authority: rather, the nodes which process transactions are 
assumed to be mutually untrusted. Potential conflicts in the execution of con- 
tracts are resolved through a consensus protocol, whose nature depends on the 
specific platform (e.g., it is based on “proof-of-work” in Ethereum). Ideally, 
contracts execute correctly whenever the adversary does not control the majority 
of some resource (e.g., computational power for “proof-of-work” consensus). 

The absence of a trusted intermediary, combined with the possibility of 
transferring money given by blockchain-based cryptocurrencies, creates a fertile 
ground for the development of smart contracts. For instance, a smart contract 
may promise to pay a reward to anyone who provides some value that satisfies 
a given public predicate. This generalises cryptographic puzzles, like breaking a 
cipher, inverting a hash function, etc. 

Since smart contracts handle the ownership of valuable assets, attackers may 
be tempted to exploit vulnerabilities in their implementation to steal or tamper 
with these assets. Although analysis tools [17,30,34] may improve the security 
of contracts, so far they have not been able to completely prevent attacks. For 
instance, a series of vulnerabilities in Ethereum contracts [10] have been exploited, 
causing money losses in the order of hundreds of millions of dollars [3-5]. 

Using domain-specific languages (possibly, not Turing-complete) could help 
to overcome these security issues, by reducing the distance between contract 
specification and implementation. For instance, despite the discouraging limi- 
tations of its scripting language, Bitcoin has been shown to support a variety 
of smart contracts. Lotteries [6,14, 16,36], gambling games [32], contingent pay- 
ments [13,24,35], and other kinds of fair multi-party computations [8,31] are 
some examples of the capabilities of Bitcoin as a smart contracts platform. 

Unlike Ethereum, where contracts can be expressed as computer programs 
with a well-defined semantics, Bitcoin contracts are usually realised as crypto- 
graphic protocols, where participants send/receive messages, verify signatures, 
and put/search transactions on the blockchain. The informal (often incomplete 
or imprecise) narration of these protocols, together with the use of poorly doc- 
umented features of Bitcoin (e.g., segregated witnesses, scripts, signature mod- 
ifiers, temporal constraints), and the overall heterogeneity in their treatment, 
pose serious obstacles to the research on smart contracts in Bitcoin. 


Contributions. This paper is, to the best of our knowledge, the first systematic 
survey of smart contracts on Bitcoin. In order to obtain a uniform and precise 
treatment, we exploit a new formal model of contracts. Our model is based on a 
process calculus with primitives to construct Bitcoin transactions, to put them 
on the blockchain, and to search the blockchain for transactions matching given 
patterns. Our calculus allows us to give smart contracts a precise operational 
semantics, which describes the interactions of the (possibly dishonest) partici- 
pants involved in the execution of a contract. 

We exploit our model to systematically formalise a large portion of the con- 
tracts proposed so far both by researchers and Bitcoin developers. In many cases, 
we find that specifying a contract with the intended security properties is sig- 
nificantly more complex than expected after reading the informal descriptions 
of the contract. Usually, such informal descriptions focus on the case where all 
participants are honest, neglecting the cases where one needs to compensate for 
some unexpected behaviour of the dishonest environment. 
Overall, our work aims at building a bridge between research communities: 
from that of cryptography, where smart contracts have been investigated first, 
to those of programming languages and formal methods, where smart contracts 
could be expressed using proper linguistic models, supporting advanced analysis 
and verification techniques. We outline some promising research perspectives on 
smart contracts, both in Bitcoin and in other cryptocurrencies, where the synergy 
between the two communities could have a strong impact in future research. 


2 Background on Bitcoin Transactions 


In this section we give a minimalistic introduction to Bitcoin [21,38], focussing on 
the crucial notion of transaction. To this purpose, we rely on the model of Bitcoin 
transactions in [11]. Here, instead of repeating the formal machinery of [11], we 
introduce the needed concepts through a series of examples. We will however 
follow the same notation of [11], and point to the formal definitions therein, to 
allow the reader to make precise the intuitions provided in this paper. 

Bitcoin is a decentralised infrastructure to securely transfer currency (the 
bitcoins, B) between users. Transfers of bitcoins are represented as transactions, 
and the history of all transactions is stored in a public, append-only, distributed 
data structure called blockchain. Each user can create an arbitrary number of 
pseudonyms through which to send and receive bitcoins. The balance of a user 
is not explicitly stored within the blockchain, but it is determined by the amount 
of unspent bitcoins directed to the pseudonyms under her control, through one 
or more transactions. The logic used for linking inputs to outputs is specified by 
programmable functions, called scripts. 

Hereafter we will abstract from a few technical details of Bitcoin, e.g. the 
fact that transactions are grouped into blocks, and that each transaction must 
pay a fee to the “miner” who appends it to the blockchain. We refer to [11] for a 
discussion on the differences between the formal model and the actual Bitcoin. 


2.1 Transactions 


In their simplest form, Bitcoin transactions allow bitcoins to be transferred 
from one participant to another. The only exceptions are the so-called coinbase 
transactions, which can generate fresh bitcoins. Following [11], we assume that 
there exists a single coinbase transaction, the first one in the blockchain. We 
represent this transaction, say $T_0$, as follows: 


$$
\begin{array}{|l|}
\hline
T_0 \\
\hline
\text{in: } \bot \\
\text{wit: } \bot \\
\text{out: } (\lambda x.\, x < 51,\ 1B) \\
\hline
\end{array}
$$


The transaction $T_0$ has three fields. The fields in and wit are set to 
$\bot$, meaning that $T_0$ does not point backwards to any other transaction 
(since $T_0$ is the first one on the blockchain). The field out contains a 
pair. The first element of the pair, $\lambda x.\, x < 51$, is a script that, 
given as input a value $x$, checks if $x < 51$ (this is just for didactical 
purposes: we will introduce more useful scripts in a while). The second element 
of the pair, $1B$, is the amount of currency that can be transferred to other 
transactions. 

Now, assume that participant A wants to redeem $1B$ from $T_0$, and transfer 
that amount under her control. To do this, A has to append to the blockchain a 
new transaction, e.g.: 


$$
\begin{array}{|l|}
\hline
T_A \\
\hline
\text{in: } T_0 \\
\text{wit: } 42 \\
\text{out: } (\lambda x.\, \mathsf{versig}_{k_A}(x),\ 1B) \\
\hline
\end{array}
$$


The field in points to the transaction $T_0$ in the blockchain. To be able to 
redeem $1B$ from there, A must provide a witness which makes the script within 
$T_0$.out evaluate to true. In this case the witness is 42, hence the redeem 
succeeds, and $T_0$ is considered spent. The script within $T_A$.out is the 
most commonly used one in Bitcoin: it verifies the signature $x$ with A’s 
public key. The message against which the signature is verified is the 
transaction¹ which attempts to redeem $T_A$. 

Now, to transfer $1B$ to another participant B, A can append to the blockchain 
the following transaction: 


$$
\begin{array}{|l|}
\hline
T_B \\
\hline
\text{in: } T_A \\
\text{wit: } \mathsf{sig}_{k_A}(T_B) \\
\text{out: } (\lambda x.\, \mathsf{versig}_{k_B}(x),\ 1B) \\
\hline
\end{array}
$$


where the witness $\mathsf{sig}_{k_A}(T_B)$ is A’s signature on $T_B$ (but for 
the wit field itself). 

The ones shown above represent just the simplest cases of transactions. More 
generally, a Bitcoin transaction can collect bitcoins from many inputs, and 
split them between one or more outputs; further, it can use more complex 
scripts, and specify time constraints on when it can be appended to the 
blockchain. Following [11], hereafter we represent transactions as tuples of 
the form (in, wit, out, absLock, relLock), where: 


- in contains the list of inputs. An input $(T, i)$ refers to the $i$-th output 
  of transaction $T$. 

- wit contains the list of witnesses, of the same length as the list of inputs. 
  For each input $(T, i)$ in the in list, the witness at the same index must 
  make the $i$-th output script of $T$ evaluate to true. 

- out contains the list of outputs. Each index refers to a pair 
  $(\lambda \vec{z}.e, v)$, where the first component is a script, and the 
  second is a currency value. 

- absLock and relLock indicate absolute and relative time constraints on when 
  the transaction can be added to the blockchain. 


In transaction fields, we represent a list $\ell_1 \cdots \ell_n$ as 
$1 \mapsto \ell_1, \ldots, n \mapsto \ell_n$, or just as $\ell_1$ when $n = 1$. 
We denote with $T_A^v$ the canonical transaction, i.e. the transaction with a 
single output of the form 
$(\lambda \varsigma.\, \mathsf{versig}_{k_A}(\varsigma),\ vB)$, and with all the 
other fields empty (denoted with $\bot$). 


1 Actually, the signature is not computed on the whole redeeming transaction, but 
only on a part of it, as shown in Sect. 2.3. 


SoK: Unraveling Bitcoin Smart Contracts 


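$T_1$ 
  in:  $\cdots$ 
  wit: $\cdots$ 
  out: $1 \mapsto (\lambda x.\, \mathsf{versig}_k(x),\ v_1 B)$ 
       $2 \mapsto (\lambda x, x'.\, e_1,\ v_2 B)$ 

$T_2$ 
  in:  $1 \mapsto (T_1, 1)$ 
  wit: $1 \mapsto \sigma_1$ 
  out: $1 \mapsto (\lambda x.\, e_2,\ v_1 B)$ 
  relLock: $1 \mapsto t$ 

$T_3$ 
  in:  $1 \mapsto (T_1, 2)$, $2 \mapsto (T_2, 1)$ 
  wit: $1 \mapsto \sigma_2\ \sigma_2'$, $2 \mapsto \sigma_3$ 
  out: $1 \mapsto (\lambda x.\, e_3,\ (v_1 + v_2) B)$ 
  absLock: $t'$ 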


Fig. 1. Three Bitcoin transactions. 


Example 1. Consider the transactions in Fig. 1. In $T_1$ there are two outputs: 
the first one transfers $v_1 B$ to any transaction $T'$ which provides as 
witness a signature of $T'$ with key $k$; the second output can transfer 
$v_2 B$ to a transaction whose witness satisfies the script $e_1$. The 
transaction $T_2$ tries to redeem $v_1 B$ from the output at index 1 of $T_1$, 
by providing the witness $\sigma_1$. Since $T_2$.relLock(1) $= t$, then $T_2$ 
can be appended only after at least $t$ time units have passed since the 
transaction in $T_2$.in(1) (i.e., $T_1$) appeared on the blockchain. In $T_3$, 
the input 1 refers to the output 2 of $T_1$, and the input 2 refers to the 
output 1 of $T_2$. The witnesses $\sigma_2$ and $\sigma_2'$ are used to 
evaluate $T_1$.out(2), replacing the occurrences of $x$ and $x'$ in $e_1$. 
Similarly, $\sigma_3$ is used to evaluate $T_2$.out(1), replacing the 
occurrences of $x$ in $e_2$. The transaction $T_3$ can be put on the blockchain 
only after time $t'$. 


2.2 Scripts 


In Bitcoin, scripts are small programs written in a non-Turing equivalent lan- 
guage. Whoever provides a witness that makes the script evaluate to “true”, can 
redeem the bitcoins retained in the associated (unspent) output. In the abstract 
model, scripts are terms of the form $\lambda \vec{z}.e$, where $\vec{z}$ is a 
sequence of variables occurring in $e$, and $e$ is an expression with the 
following syntax: 


$$
\begin{array}{rl}
e ::= & x \mid k \mid e + e \mid e - e \mid e = e \mid e < e \mid
        \mathsf{if}\ e\ \mathsf{then}\ e\ \mathsf{else}\ e \\
 \mid & |e| \mid H(e) \mid \mathsf{versig}_{\vec{k}}(e) \mid
        \mathsf{absAfter}\ t : e \mid \mathsf{relAfter}\ t : e
\end{array}
$$


Besides variables $x$, constants $k$, and basic arithmetic/logical operators, 
the other expressions are peculiar: $|e|$ denotes the size, in bytes, of the 
evaluation of $e$; $H(e)$ evaluates to the hash of $e$; 
$\mathsf{versig}_{\vec{k}}(e)$ evaluates to true iff the sequence of signatures 
$e$ (say, of length $m$) is verified by using $m$ out of the $n$ keys in 
$\vec{k}$. For instance, the script $\lambda x.\, \mathsf{versig}_k(x)$ is 
satisfied if $x$ is a signature on the redeeming transaction, verified with the 
key $k$. The expressions absAfter $t : e$ and relAfter $t : e$ define absolute 
and relative time constraints: they evaluate as $e$ if the constraints are 
satisfied, otherwise they evaluate to false. 

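In Fig. 2 we recap from [11] the semantics of script expressions. The function 
$[\![\cdot]\!]_{T,i,\rho}$ takes three parameters: $T$ is the redeeming 
transaction, $i$ is the index of the redeeming witness, and $\rho$ is a map 
from variables to values. We use $\bot$ to represent the “failure” of the 
evaluation, $H$ for a public hash function, and size($n$) for the size (in 
bytes) of an integer $n$. The function 
$\mathsf{ver}_{\vec{k}}(\vec{\sigma}, T, i)$ verifies a sequence of signatures 
$\vec{\sigma}$ against a sequence of keys $\vec{k}$ (see Sect. 2.3). All the 
semantic operators used in Fig. 2 are strict, i.e. they evaluate to $\bot$ if 
any of their operands is $\bot$. We use syntactic sugar for expressions, e.g. 
false denotes $1 = 0$, true denotes $1 = 1$, while $e$ and $e'$ denotes 
if $e$ then $e'$ else false. 


$$
\begin{array}{l}
[\![x]\!]_{T,i,\rho} = \rho(x) \qquad
[\![k]\!]_{T,i,\rho} = k \qquad
[\![e \circ e']\!]_{T,i,\rho} = [\![e]\!]_{T,i,\rho} \circ_\bot [\![e']\!]_{T,i,\rho}
  \quad (\circ \in \{+, -, =, <\}) \\[2pt]
[\![\mathsf{if}\ e_0\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2]\!]_{T,i,\rho} =
  \mathit{if}\ [\![e_0]\!]_{T,i,\rho}\ \mathit{then}\ [\![e_1]\!]_{T,i,\rho}\
  \mathit{else}\ [\![e_2]\!]_{T,i,\rho} \\[2pt]
[\![|e|]\!]_{T,i,\rho} = \mathit{size}([\![e]\!]_{T,i,\rho}) \qquad
[\![H(e)]\!]_{T,i,\rho} = H([\![e]\!]_{T,i,\rho}) \qquad
[\![\mathsf{versig}_{\vec{k}}(e)]\!]_{T,i,\rho} =
  \mathsf{ver}_{\vec{k}}([\![e]\!]_{T,i,\rho}, T, i) \\[2pt]
[\![\mathsf{absAfter}\ t : e]\!]_{T,i,\rho} =
  \mathit{if}\ T.\mathsf{absLock} \geq t\ \mathit{then}\ [\![e]\!]_{T,i,\rho}\
  \mathit{else}\ \bot \\[2pt]
[\![\mathsf{relAfter}\ t : e]\!]_{T,i,\rho} =
  \mathit{if}\ T.\mathsf{relLock}(i) \geq t\ \mathit{then}\ [\![e]\!]_{T,i,\rho}\
  \mathit{else}\ \bot
\end{array}
$$


Fig. 2. Semantics of script expressions. 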
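
The evaluation rules of Fig. 2 can be turned into a small interpreter. The 
sketch below is an illustration under several assumptions that are not part of 
the formal model: expressions are encoded as nested tuples, ⊥ is represented by 
None, H is instantiated with SHA-256, and signature verification is delegated 
to a caller-supplied function (toy_ver is a made-up stand-in, not a real 
signature scheme). The timelock tests are written as non-strict comparisons, 
which is how the examples use them (a lock equal to the threshold is 
satisfied). 

```python
import hashlib

BOTTOM = None  # failure value, written ⊥ in the text

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "=": lambda a, b: a == b,
    "<": lambda a, b: a < b,
}


def size(v):
    """Size in bytes of a value (integers use their minimal byte length)."""
    if isinstance(v, bytes):
        return len(v)
    return max(1, (int(v).bit_length() + 7) // 8)


def H(v):
    """Public hash function; SHA-256 is an assumption of this sketch."""
    data = v if isinstance(v, bytes) else str(v).encode()
    return hashlib.sha256(data).digest()


def evaluate(e, T, i, rho, ver):
    """Evaluate expression e for redeeming transaction T and witness index i."""
    tag = e[0]
    if tag == "var":
        return rho.get(e[1], BOTTOM)
    if tag == "const":
        return e[1]
    if tag == "op":  # strict binary operator: ⊥ if any operand is ⊥
        a = evaluate(e[2], T, i, rho, ver)
        b = evaluate(e[3], T, i, rho, ver)
        return BOTTOM if a is BOTTOM or b is BOTTOM else OPS[e[1]](a, b)
    if tag == "if":
        cond = evaluate(e[1], T, i, rho, ver)
        if cond is BOTTOM:
            return BOTTOM
        return evaluate(e[2] if cond else e[3], T, i, rho, ver)
    if tag in ("size", "hash"):
        v = evaluate(e[1], T, i, rho, ver)
        if v is BOTTOM:
            return BOTTOM
        return size(v) if tag == "size" else H(v)
    if tag == "versig":
        sigs = evaluate(e[2], T, i, rho, ver)
        return BOTTOM if sigs is BOTTOM else ver(e[1], sigs, T, i)
    if tag == "absAfter":
        return evaluate(e[2], T, i, rho, ver) if T["absLock"] >= e[1] else BOTTOM
    if tag == "relAfter":
        ok = T["relLock"].get(i, 0) >= e[1]
        return evaluate(e[2], T, i, rho, ver) if ok else BOTTOM
    raise ValueError(f"unknown expression tag: {tag!r}")


# Syntactic sugar: `e and e'` is `if e then e' else false`, false is `1 = 0`.
FALSE = ("op", "=", ("const", 1), ("const", 0))


def and_(e1, e2):
    return ("if", e1, e2, FALSE)


def toy_ver(keys, sig, T, i):
    """Made-up verifier: a 'signature' is just a tagged byte string."""
    return sig == b"signed:" + keys[0] + b":" + T["id"]


# A script of the shape absAfter t' : versig_k(x) and H(x') = h, with t' = 10.
s = b"secret"
h = H(s)
e1 = ("absAfter", 10,
      and_(("versig", (b"k",), ("var", "x")),
           ("op", "=", ("hash", ("var", "x'")), ("const", h))))
T3 = {"id": b"T3", "absLock": 10, "relLock": {}}
rho = {"x": b"signed:k:T3", "x'": s}
result = evaluate(e1, T3, 1, rho, toy_ver)
```

Here result is true because the lock, the signature and the hash preimage all 
check out; lowering absLock below 10 yields ⊥, and a wrong preimage yields 
false. 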


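Example 2. Recall the transactions in Fig. 1. Let $e_1$ (the script expression 
within $T_1$.out(2)) be defined as 
$e_1 = \mathsf{absAfter}\ t' : \mathsf{versig}_k(x)\ \mathsf{and}\ H(x') = h$, 
for $h$ and $t'$ constants such that $T_3$.absLock $\geq t'$. Further, let 
$\sigma_2$ and $\sigma_2'$ (the witnesses within $T_3$.wit(1)) be respectively 
$\mathsf{sig}_k(T_3)$ and $s$, where $\mathsf{sig}_k(T_3)$ is the signature of 
$T_3$ (excluding its witnesses) with key $k$, and $s$ is a preimage of $h$, 
i.e. $h = H(s)$. Let $\rho = \{x \mapsto \mathsf{sig}_k(T_3), x' \mapsto s\}$. 
To redeem $T_1$.out(2) with the witness $T_3$.wit(1), the script expression is 
evaluated as follows: 


$$
\begin{array}{rll}
 & [\![\mathsf{absAfter}\ t' : \mathsf{versig}_k(x)\ \mathsf{and}\ H(x') = h]\!]_{T_3,1,\rho} & \\
= & [\![\mathsf{versig}_k(x)\ \mathsf{and}\ H(x') = h]\!]_{T_3,1,\rho}
  & \text{as } T_3.\mathsf{absLock} \geq t' \\
= & [\![\mathsf{versig}_k(x)]\!]_{T_3,1,\rho} \land [\![H(x') = h]\!]_{T_3,1,\rho} & \\
= & \mathsf{ver}_k(\rho(x), T_3, 1) \land
    ([\![H(x')]\!]_{T_3,1,\rho} = [\![h]\!]_{T_3,1,\rho}) & \\
= & \mathsf{ver}_k(\mathsf{sig}_k(T_3), T_3, 1) \land (H(\rho(x')) = h)
  & \text{as } \rho(x) = \mathsf{sig}_k(T_3) \\
= & \mathit{true} & \text{as } \rho(x') = s
\end{array}
$$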


2.3 Transaction Signatures 


The signatures verified with versig never apply to the whole transaction: the 
content of the wit field is never signed, while the other fields can be 
excluded from the signature according to some predefined patterns. To sign 
parts of a transaction, we first erase the fields which we want to neglect in 
the signature. Technically, we set these fields to the “null” value $\bot$ 
using a transaction substitution. 

A transaction substitution $\{f \mapsto d\}$ replaces the content of field $f$ 
with $d$. If the field is indexed (i.e., all fields but absLock), we denote 
with $\{f(i) \mapsto d\}$ the substitution of the $i$-th item in field $f$, and 
with $\{f(\neq i) \mapsto d\}$ the substitution of all the items of field $f$ 
but the $i$-th. For instance, to set all the elements of the wit field of $T$ 
to $\bot$, we write $T\{\mathsf{wit} \mapsto \bot\}$, and to additionally set 
the second input to $\bot$ we write 
$T\{\mathsf{wit} \mapsto \bot\}\{\mathsf{in}(2) \mapsto \bot\}$. 

In Bitcoin, there exists a fixed set of transaction substitutions. We represent 
them as signature modifiers, i.e. transaction substitutions which set to $\bot$ 
the fields which will not be signed. Signatures never apply to the whole 
transaction: 
modifiers always discard the content of the wit, while they can keep all the 
inputs or only one, and all the outputs, or only one, or none. Modifiers also 
take a parameter i, which is instantiated to the index of the witness where the 
signature will be included. Below we only present two signature modifiers, since 
the others are not commonly used in Bitcoin smart contracts. 

The modifier $aa_i$ only sets the first witness to $i$, and the other witnesses 
to $\bot$ (so, all inputs and all outputs are signed). This ensures that a 
signature computed for being included in the witness at index $i$ cannot be 
used in any witness with index $j \neq i$: 


$$
aa_i(T) = T\{\mathsf{wit}(1) \mapsto i\}\{\mathsf{wit}(\neq 1) \mapsto \bot\}
$$


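The modifier $sa_i$ removes the witnesses, and all the inputs but the one at 
index $i$ (so, a single input and all outputs are signed). Differently from 
$aa_i$, this modifier discards the index $i$, so the signature can be included 
in any witness: 


$$
\begin{array}{rl}
sa_i(T) = aa_1( & T\{\mathsf{wit} \mapsto \bot\}
                   \{\mathsf{in}(1) \mapsto T.\mathsf{in}(i)\}
                   \{\mathsf{in}(\neq 1) \mapsto \bot\} \\
                & \{\mathsf{relLock}(1) \mapsto T.\mathsf{relLock}(i)\}
                   \{\mathsf{relLock}(\neq 1) \mapsto \bot\}\ )
\end{array}
$$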
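
The effect of the two modifiers can be illustrated on a dictionary-based 
transaction. This is a sketch under assumptions of our own: transactions are 
plain dictionaries, indexed fields are dictionaries from indices to values, and 
an item set to ⊥ is represented by an absent key. The function names aa and sa 
mirror the modifiers above; everything else is illustrative. 

```python
def aa(T: dict, i: int) -> dict:
    """aa_i: record index i in the first witness, blank all other witnesses.

    Inputs, outputs and timelocks are left untouched, so they are all signed.
    """
    signed = dict(T)         # shallow copy: the original T is not modified
    signed["wit"] = {1: i}   # every other witness is ⊥, i.e. absent
    return signed


def sa(T: dict, i: int) -> dict:
    """sa_i: keep only input i (moved to position 1) and all the outputs."""
    stripped = dict(T)
    stripped["wit"] = {}
    stripped["in"] = {1: T["in"][i]}
    rel = T["relLock"].get(i)
    stripped["relLock"] = {} if rel is None else {1: rel}
    return aa(stripped, 1)   # the index is fixed to 1, so i is discarded


# A transaction with two inputs, and one that only has the second of them.
T = {
    "in": {1: ("T1", 2), 2: ("T2", 1)},
    "wit": {1: "w1", 2: "w2"},
    "out": {1: ("script", 5)},
    "absLock": 7,
    "relLock": {2: 3},
}
T_single = {
    "in": {1: ("T2", 1)},
    "wit": {},
    "out": {1: ("script", 5)},
    "absLock": 7,
    "relLock": {1: 3},
}
```

With aa the signed data depends on the witness index, so aa(T, 1) and aa(T, 2) 
differ. With sa the data signed for input 2 of T coincides with the data signed 
for the only input of T_single, which is why such a signature can be placed in 
any witness. 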


Signatures carry information about which parts of the transaction are signed: 
formally, they are pairs $\sigma = (w, \mu)$, where $\mu$ is the modifier, and 
$w$ is the signature on the transaction $T$ modified with $\mu$. We denote such 
signature as $\mathsf{sig}_k^{\mu,i}(T)$, where $k$ is a key, and $i$ is the 
index used by $\mu$, if any. Verification of a signature $\sigma$ for index $i$ 
is denoted by $\mathsf{ver}_k(\sigma, T, i)$. Formally: 


$$
\mathsf{sig}_k^{\mu,i}(T) = (\mathit{sig}_k(\mu_i(T)),\ \mu)
\qquad
\mathsf{ver}_k(\sigma, T, i) = \mathit{ver}_k(w, \mu_i(T))
  \quad \text{if } \sigma = (w, \mu)
$$


where sig and ver are, respectively, the signing function and the verification 
function of a digital signature scheme. 

Multi-signature verification $\mathsf{ver}_{\vec{k}}(\vec{\sigma}, T, i)$ 
extends verification to the case where $\vec{\sigma}$ is a sequence of 
signatures and $\vec{k}$ is a sequence of keys. Intuitively, if 
$|\vec{\sigma}| = m$ and $|\vec{k}| = n$, it implements an $m$-of-$n$ 
multi-signature scheme, evaluating to true if all the $m$ signatures match 
(some of) the keys in $\vec{k}$. The actual definition also takes into account 
the order of signatures, as formalised in Definition 6 of [11]. 
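
An order-sensitive m-of-n check can be sketched as follows. The sketch assumes 
that signatures must appear in the same relative order as the keys they match, 
which is how Bitcoin's multi-signature opcode behaves; the precise definition 
used by the model is the one in Definition 6 of [11] and may differ in detail. 
HMAC is used only as a self-contained stand-in for a real digital signature 
scheme, and all function names are illustrative. 

```python
import hashlib
import hmac
from typing import Sequence


def toy_sign(key: bytes, msg: bytes) -> bytes:
    """Stand-in for a signing function (HMAC, not a real signature scheme)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def toy_verify(key: bytes, sig: bytes, msg: bytes) -> bool:
    """Stand-in for the verification function of a signature scheme."""
    return hmac.compare_digest(toy_sign(key, msg), sig)


def multisig_verify(keys: Sequence[bytes], sigs: Sequence[bytes], msg: bytes) -> bool:
    """True if every signature matches a distinct key, scanning keys in order."""
    j = 0  # position of the next key that may still be used
    for sig in sigs:
        while j < len(keys) and not toy_verify(keys[j], sig, msg):
            j += 1          # skip keys that do not match this signature
        if j == len(keys):
            return False    # ran out of keys: this signature is unmatched
        j += 1              # a key is consumed by the signature it verified
    return True


keys = [b"k1", b"k2", b"k3"]
msg = b"modified transaction bytes"
s1, s2, s3 = (toy_sign(k, msg) for k in keys)
```

For instance, the pair (s1, s3) passes a 2-of-3 check, while the same 
signatures in the opposite order, or the same signature used twice, do not. 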


2.4 Blockchain and Consistency 


Abstracting away from the fact that the actual Bitcoin blockchain is formed by 
blocks of transactions, here we represent a blockchain $B$ as a sequence of 
pairs $(T_i, t_i)$, where $t_i$ is the time when $T_i$ has been appended, and 
the values $t_i$ are increasing. We say that the $j$-th output of the 
transaction $T_i$ in the blockchain is spent (or, for brevity, that $(T_i, j)$ 
is spent) if there exists some transaction $T_{i'}$ in the blockchain (with 
$i' > i$) and some $j'$ such that $T_{i'}$.in($j'$) $= (T_i, j)$. 

We now describe when a pair $(T, t)$ can be appended to 
$B = (T_0, t_0) \cdots (T_n, t_n)$. Following [11], we say that $T$ is a 
consistent update of $B$ at time $t$, in symbols $B \triangleright (T, t)$, 
when the following conditions hold: 
1. for each input $i$ of $T$, if $T.\mathsf{in}(i) = (T', j)$ then:
(a) $T'$ corresponds to one of the transactions in $\mathbf{B}$;
(b) $(T', j)$ is unspent in $\mathbf{B}$;
(c) the witness $T.\mathsf{wit}(i)$ makes the script in $T'.\mathsf{out}(j)$ evaluate to true;
2. the time constraints absLock and relLock in $T$ are satisfied at time $t \geq t_n$;
3. the sum of the amounts of the inputs of $T$ is greater than or equal² to the sum of
the amounts of its outputs.
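
The three conditions can be read as a checking procedure. The Python sketch below is a simplification under stated assumptions: scripts are modelled as plain predicates on the witness (the real scripts also inspect the redeeming transaction), an absolute lock is taken to be satisfied when the current time has reached it, and a relative lock when enough time has elapsed since the referenced transaction was appended. The class `Tx` and the function `is_consistent_update` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Tx:
    name: str
    ins: list      # list of (tx_name, output_index), indices start at 1
    wits: list     # one witness per input
    outs: list     # list of (script, amount); script: witness -> bool
    abs_lock: int = 0
    rel_locks: dict = field(default_factory=dict)  # input position -> delay

def is_consistent_update(chain, tx, t):
    """chain: list of (Tx, time) pairs with increasing times."""
    by_name = {T.name: (T, ti) for T, ti in chain}
    spent = {ref for T, _ in chain for ref in T.ins}
    total_in = 0
    for pos, ((src, j), wit) in enumerate(zip(tx.ins, tx.wits), start=1):
        if src not in by_name:                        # condition 1(a)
            return False
        if (src, j) in spent:                         # condition 1(b)
            return False
        src_tx, src_time = by_name[src]
        script, amount = src_tx.outs[j - 1]
        if not script(wit):                           # condition 1(c)
            return False
        if t - src_time < tx.rel_locks.get(pos, 0):   # condition 2, relative lock
            return False
        total_in += amount
    if chain and t < chain[-1][1]:                    # condition 2, t not before t_n
        return False
    if t < tx.abs_lock:                               # condition 2, absolute lock
        return False
    return total_in >= sum(a for _, a in tx.outs)     # condition 3

pay = lambda w: w == "ok"
T1 = Tx("T1", [], [], [(pay, 5)])
chain = [(T1, 0)]
T2 = Tx("T2", [("T1", 1)], ["ok"], [(pay, 5)], rel_locks={1: 3})
```

In this toy run `T2` is rejected at time 2 because its relative lock of 3 has not elapsed, and accepted at time 3.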


We assume that each transaction $T_i$ in the blockchain is a consistent update of
the sequence of past transactions $T_0 \cdots T_{i-1}$. The consistency of the blockchain
is actually ensured by the Bitcoin consensus protocol.


Example 3. Recall the transactions in Fig. 1. Assume a blockchain $\mathbf{B}$ whose last
pair is $(T_1, t_1)$ and $t_1 \geq t'$, while $T_2$ and $T_3$ are not in $\mathbf{B}$.

We verify that $(T_2, t_2)$ is a consistent update of $\mathbf{B}$, assuming $t_2 = t_1 + t$ and
that $\sigma_1$ is the signature of $T_2$ with (the private part of) key $k$. The only input
of $T_2$ is $(T_1, 1)$. Conditions 1a and 1b are satisfied, since $(T_1, 1)$ is unspent in
$\mathbf{B}$. Condition 1c holds because $\mathsf{versig}_k(\sigma_1)$ evaluates to true. Condition 2 holds:
indeed the relative timelock in $T_2$ is satisfied because $t_2 - t_1 \geq t$. Condition 3
holds because the amount of the input of $T_2$, i.e. $v_1$B, is equal to the amount
of its output. Note instead that $(T_3, t_2)$ would not be a consistent update of $\mathbf{B}$,
since it violates condition 1a on the second input.

Now, let $\mathbf{B}' = \mathbf{B}(T_2, t_2)$. We verify that $(T_3, t_3)$ is a consistent update of
$\mathbf{B}'$, assuming $t_3 \geq t_2$, $e_1$ as in Example 2, and $e_2 = \mathsf{versig}_{k'}(x)$. Further, let
$\sigma_2 = \mathsf{sig}_k(T_3)$, let $\sigma'_2 = s$, and $\sigma_3 = \mathsf{sig}_{k'}(T_3)$. Conditions 1a and 1b hold,
because $T_1$ and $T_2$ are in $\mathbf{B}'$, and the referred outputs are unspent. Condition 1c
holds because the output scripts $T_1.\mathsf{out}(2)$ and $T_2.\mathsf{out}(1)$ against $\sigma_2, \sigma'_2$ and $\sigma_3$
evaluate to true. Condition 2 is satisfied at $t_3 \geq t_2 \geq t_1 \geq t'$. Finally, condition 3
holds because the amount $(v_1 + v_2)$B in $T_3.\mathsf{out}(1)$ is equal to the sum of the
amounts in $T_1.\mathsf{out}(2)$ and $T_2.\mathsf{out}(1)$.


3 Modelling Bitcoin Contracts 


In this section we introduce a formal model of the behaviour of the participants
in a contract, building upon the model of Bitcoin transactions in [11].

We start by formalising a simple language of expressions, which represent
both the messages sent over the network, and the values used in internal
computations made by the participants. Hereafter, we assume a set $\mathit{Var}$ of variables,
and we define the set $\mathit{Val}$ of values comprising constants $k \in \mathbb{Z}$, signatures $\sigma$,
scripts $\lambda z.e$, transactions $T$, and currency values $v$.


2 The difference between the amount of inputs and that of outputs is the fee paid to
the miner who publishes the transaction.
Fig. 3. Semantics of contract expressions.


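Definition 1 (Contract expressions). We define contract expressions
through the following syntax:

$$\begin{array}{rcll}
E, T & ::= & \nu & \text{value } (\nu \in \mathit{Val}) \\
 & \mid & x & \text{variable } (x \in \mathit{Var}) \\
 & \mid & \mathsf{sig}^{\mu,i}_{k}(T) & \text{signature } (\mu \text{ signature modifier}) \\
 & \mid & \mathsf{versig}_{\vec{k}}(\vec{E}, T, i) & \text{(multi) signature verification} \\
 & \mid & T\{f(i) \mapsto \vec{E}\} & \text{transaction field update} \\
 & \mid & (E, E) & \text{pair} \\
 & \mid & E \text{ and } E \ \mid\ E \text{ or } E \ \mid\ \text{not } E & \text{logical expressions} \\
 & \mid & E + E \ \mid\ \cdots & \text{arithmetic expressions}
\end{array}$$

where $\vec{E}$ denotes a finite sequence of expressions (i.e., $\vec{E} = E_1 \cdots E_n$). We define
the function $[\![\cdot]\!]$ from (variable-free) contract expressions to values in Fig. 3. As
a notational shorthand, we omit the index $i$ in sig (resp. versig) when the signed
(resp. verified) transactions have a single input.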


Intuitively, when the expression $T$ evaluates to a transaction, the expression
$T\{f(i) \mapsto \vec{E}\}$ represents the transaction obtained from it by substituting the field
$f(i)$ with the sequence of values obtained by evaluating $\vec{E}$. For instance,
$T\{\mathsf{wit}(1) \mapsto \sigma\}$ denotes the transaction obtained from $T$ by replacing the witness
at index 1 with the signature $\sigma$. Further, $\mathsf{sig}^{\mu,i}_k(T)$ evaluates to the signature of the
transaction represented by $T$, and $\mathsf{versig}_{\vec{k}}(\vec{E}, T, i)$ represents the m-of-n
multi-signature verification of the transaction represented by $T$. Both for signing and
for verification, the parameter $i$ represents the index where the signature will be used.
We assume a simple type system (not specified here) that rules out ill-formed
expressions, e.g. $k\{\mathsf{wit}(1) \mapsto T\}$.

We formalise the behaviour of a participant as an endpoint protocol, i.e. a 
process where the participant can perform the following actions: (i) send/receive 
messages to/from other participants; (ii) put a transaction on the ledger; 
(iii) wait until some transactions appear on the blockchain; (iv) do some internal 
computation. Note that the last kind of operation allows a participant to craft 
a transaction before putting it on the blockchain, e.g. setting the wit field to her 
signature, and later on adding the signature received from another participant. 
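Definition 2 (Endpoint protocols). Assume a set of participants (named
A, B, C, ...). We define prefixes $\pi$, and protocols $P, Q, R, \ldots$ as follows:

$$\begin{array}{rcll}
\pi & ::= & A \,!\, \vec{E} & \text{send messages to A} \\
 & \mid & A \,?\, \vec{x} & \text{receive messages from A} \\
 & \mid & \mathsf{put}\ T & \text{append transaction } T \text{ to the blockchain} \\
 & \mid & \mathsf{ask}\ \vec{T}\ \mathsf{as}\ \vec{x} & \text{wait until all transactions in } \vec{T} \text{ are on the blockchain} \\
 & \mid & \mathsf{check}\ E & \text{test condition} \\
P & ::= & \sum_{i \in I} \pi_i.P_i & \text{guarded choice } (I \text{ finite set}) \\
 & \mid & P \mid P & \text{parallel composition} \\
 & \mid & X(\vec{E}) & \text{named process}
\end{array}$$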


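We assume that each name $X$ has a unique defining equation $X(\vec{x}) = P$ where
the free variables in $P$ are included in $\vec{x}$. We use the following syntactic sugar:

- $\tau \triangleq \mathsf{check}\ \mathit{true}$, the internal action;
- $0 \triangleq \sum_{\emptyset} P$, the terminated protocol (as usual, we omit trailing 0s);
- $\mathsf{if}\ E\ \mathsf{then}\ P\ \mathsf{else}\ Q \triangleq \mathsf{check}\ E\,.\,P + \mathsf{check}\ \mathsf{not}\ E\,.\,Q$;
- $\pi_1.Q_1 + P \triangleq \sum_{i \in I \cup \{1\}} \pi_i.Q_i$, provided that $P = \sum_{i \in I} \pi_i.Q_i$ and $1 \notin I$;
- $\mathsf{let}\ x = E\ \mathsf{in}\ P \triangleq P\{E/x\}$, i.e. $P$ where $x$ is replaced by $E$.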


The behaviour of protocols is defined in terms of a labelled transition system (LTS)
between systems, i.e. the parallel composition of the protocols of all participants,
and the blockchain.


Definition 3 (Semantics of protocols). A system $S$ is a term of the form
$A_1[P_1] \mid \cdots \mid A_n[P_n] \mid (\mathbf{B}, t)$, where (i) all the $A_i$ are distinct; (ii) there exists a
single component $(\mathbf{B}, t)$, representing the current state of the blockchain $\mathbf{B}$, and
the current time $t$; (iii) systems are up to commutativity and associativity of $\mid$.
We define the relation $\to$ between systems in Fig. 4, where $\mathit{match}_{\mathbf{B}}(T)$ is the set
of all the transactions in $\mathbf{B}$ that are equal to $T$, except for the witnesses. When
writing $S \mid S'$ we intend that the conditions above are respected.


Intuitively, a guarded choice $\sum_i \pi_i.P_i$ can behave as one of the branches
$P_i$. A parallel composition $P \mid Q$ executes concurrently $P$ and $Q$. All the rules
(except the last two) specify how a protocol $(\pi.P + Q) \mid R$ evolves within a
system. Rule [Com] models a message exchange between A and B: participant A
sends messages $\vec{E}$, which are received by B on variables $\vec{x}$. Communication is
synchronous, i.e. A is blocked until B is ready to receive. Rule [Check] allows the
branch $P$ of a sum to proceed if the condition represented by $E$ is true. Rule [Put]
allows A to append a transaction to the blockchain, provided that the update
is consistent. Rule [Ask] allows the branch $P$ of a sum to proceed only when
the blockchain contains some transactions $T'_1 \cdots T'_n$ obtained by instantiating
some $\bot$ fields in $\vec{T}$ (see Sect. 2). This form of pattern matching is crucial because
the value of some fields (e.g., wit) may not be known at the time the protocol
is written. When the ask prefix unblocks, the variables $\vec{x}$ in $P$ are bound to
Fig. 4. Semantics of endpoint protocols.


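$T$:     in: $(T_A, 1)$;  wit: $\bot$;  out: $(\lambda\varsigma\varsigma'.\mathsf{versig}_{k_A k_B}(\varsigma\varsigma'), 1\text{B})$
$T'_A$:  in: $(T, 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_A}(\varsigma), 1\text{B})$
$T'_B$:  in: $(T, 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_B}(\varsigma), 1\text{B})$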


Fig. 5. Transactions of the naïve escrow contract. 


$T'_1 \cdots T'_n$, so making it possible to inspect their actual fields. Rule [Def] allows a
named process $X(\vec{E})$ to evolve as $P$, assuming a defining equation $X(\vec{x}) = P$.
The variables $\vec{x}$ in $P$ are substituted with the results of the evaluation of $\vec{E}$.
Such defining equations can be used to specify recursive behaviours. Finally,
rule [Delay] allows time to pass³.


Example 4 (Naive escrow). A buyer A wants to buy an item from the seller
B, but they do not trust each other. So, they would like to use a contract to
ensure that B will get paid if and only if A gets her item. In a naive attempt
to realise this, they use the transactions in Fig. 5, where we assume that $(T_A, 1)$,
used in $T.\mathsf{in}$, is a transaction output redeemable by A through her key $k_A$. The
transaction $T$ makes A deposit 1B, which can be redeemed by a transaction
carrying the signatures of both A and B. The transactions $T'_A$ and $T'_B$ redeem
$T$, transferring the money to A or B, respectively.
The protocols of A and B are, respectively, $P_A$ and $Q_B$:


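$$\begin{array}{rcl}
P_A & = & \mathsf{put}\ T\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T)\}.\ P' \\
P' & = & \tau.\ B\,!\,\mathsf{sig}^{aa}_{k_A}(T'_B) \ +\ \tau.\ B\,?\,x.\ \mathsf{put}\ T'_A\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T'_A)\ x\} \\
Q_B & = & \mathsf{ask}\ T.\ (\tau.\ A\,?\,x.\ \mathsf{put}\ T'_B\{\mathsf{wit} \mapsto x\ \mathsf{sig}^{aa}_{k_B}(T'_B)\} \ +\ \tau.\ A\,!\,\mathsf{sig}^{aa}_{k_B}(T'_A))
\end{array}$$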
3 To keep our presentation simple, we have not included time-constraining operators
in endpoint protocols. In case one needs a finer-grained control of time, well-known
techniques [39] exist to extend a process algebra like ours with these operators.
First, A adds her signature to $T$, and puts it on the blockchain. Then, she
internally chooses whether to unblock the deposit for B or to request a refund. In the
first case, A sends $\mathsf{sig}^{aa}_{k_A}(T'_B)$ to B. In the second case, she waits to receive the
signature $\mathsf{sig}^{aa}_{k_B}(T'_A)$ from B (saving it in the variable $x$); afterwards, she puts $T'_A$ on
the blockchain (after setting wit) to redeem the deposit. The seller B waits to see
$T$ on the blockchain. Then, he chooses either to receive the signature $\mathsf{sig}^{aa}_{k_A}(T'_B)$
from A (and then redeem the payment by putting $T'_B$ on the blockchain), or to
refund A, by sending his signature $\mathsf{sig}^{aa}_{k_B}(T'_A)$.

This contract is not secure if either A or B is dishonest. On the one hand, a
dishonest A can prevent B from redeeming the deposit, even if she had already
received the item (to do that, it suffices not to send her signature, taking the
rightmost branch in $P'$). On the other hand, a dishonest B can simply avoid
sending the item and the signature (taking the leftmost branch in $Q_B$): in this way,
the deposit gets frozen. For instance, let $S = A[P_A] \mid B[Q_B] \mid (\mathbf{B}, t)$, where $\mathbf{B}$
contains $T_A$ unredeemed. The scenario where A has never received the item,
while B dishonestly attempts to receive the payment, is modelled as follows:


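$$\begin{array}{rl}
S \ \to & A[P'] \mid B[Q_B] \mid (\mathbf{B}(T, t), t) \\
\to & A[P'] \mid B[\tau.\ A\,?\,x.\ \mathsf{put}\ T'_B\{\mathsf{wit} \mapsto x\ \mathsf{sig}^{aa}_{k_B}(T'_B)\} + \tau.\ A\,!\,\mathsf{sig}^{aa}_{k_B}(T'_A)] \mid \cdots \\
\to & A[B\,?\,x.\ \mathsf{put}\ T'_A\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T'_A)\ x\}] \mid B[A\,?\,x.\ \mathsf{put}\ T'_B\{\mathsf{wit} \mapsto x\ \mathsf{sig}^{aa}_{k_B}(T'_B)\}] \mid \cdots
\end{array}$$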


At this point the computation is stuck, because both A and B are waiting for a
message from the other participant. We will show in Sect. 4.3 how to design a
secure escrow contract, with the intermediation of a trusted arbiter.
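
The possible runs of the naive escrow can be enumerated with a few lines of Python. This is only an abstraction of the protocols: each participant's internal choice is reduced to a label, and the outcome function encodes the fact that communication is synchronous, so a send fires only when the peer is receiving. The names `A_CHOICES`, `B_CHOICES`, and `outcome` are illustrative.

```python
from itertools import product

# After T is on the blockchain, each participant internally commits
# to one branch of its protocol.
A_CHOICES = ("send", "wait")   # send her signature on T'_B / wait for B's signature on T'_A
B_CHOICES = ("wait", "send")   # wait for A's signature / send his signature on T'_A

def outcome(a, b):
    """Result of one run, given the branches chosen by A and B."""
    if a == "send" and b == "wait":
        return "B redeems"     # B receives A's signature and publishes T'_B
    if a == "wait" and b == "send":
        return "A redeems"     # A receives B's signature and publishes T'_A
    return "stuck"             # both waiting, or both blocked on a send

results = {(a, b): outcome(a, b) for a, b in product(A_CHOICES, B_CHOICES)}
stuck_runs = [choice for choice, res in results.items() if res == "stuck"]
```

Two of the four combinations leave the deposit frozen, including the run shown above where both participants wait.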


4 A Survey of Smart Contracts on Bitcoin 


We now present a comprehensive survey of smart contracts on Bitcoin,
comprising those published in the academic literature, and those found online. To
this aim we exploit the model of computation introduced in Sect. 3. Remarkably,
all the following contracts can be implemented by only using so-called standard
transactions⁴, e.g. via the compilation technique in [11]. This is crucial, because
non-standard transactions are currently discarded by the Bitcoin network.


4.1 Oracle 


In many concrete scenarios one would like to make the execution of a contract
depend on some real-world events, e.g. results of football matches for a betting
contract, or feeds of flight delays for an insurance contract. However, the
evaluation of Bitcoin scripts cannot depend on the environment, so in these scenarios
one has to resort to a trusted third party, or oracle [2,19], who notifies real-world
events by providing signatures on certain transactions.

For example, assume that A wants to transfer vB to B only if a certain
event, notified by an oracle O, happens. To do that, A puts on the blockchain


4 https://bitcoin.org/en/developer-guide#standard-transactions.
$T$:     in: $(T_A, 1)$;  wit: $\mathsf{sig}^{aa}_{k_A}(T)$;  out: $(\lambda\varsigma\varsigma'.\mathsf{versig}_{k_B k_O}(\varsigma\varsigma'), v\text{B})$
$T'_B$:  in: $(T, 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_B}(\varsigma), v\text{B})$


Fig. 6. Transactions of a contract relying on an oracle. 


the transaction $T$ in Fig. 6, which can be redeemed by a transaction carrying
the signatures of both B and O. Further, A instructs the oracle to provide his
signature to B upon the occurrence of the expected event.

We model the behaviour of B as the following protocol: 


$$P_B = O\,?\,x.\ \mathsf{put}\ T'_B\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_B}(T'_B)\ x\}$$


Here, B waits to receive the signature $\mathsf{sig}^{aa}_{k_O}(T'_B)$ from O, then he puts $T'_B$ on the
blockchain (after setting its wit) to redeem $T$. In practice, oracles like the one
needed in this contract are available as services in the Bitcoin ecosystem⁵.

Notice that, in case the event certified by the oracle never happens, the vB
within $T$ are frozen forever. To avoid this situation, one can add a time constraint
to the output script of $T$, e.g. as in the transaction $T_{bond}$ in Fig. 10.


4.2 Crowdfunding 


Assume that the curator C of a crowdfunding campaign wants to fund a venture
V by collecting vB from a set $\{A_i\}_{i \in I}$ of investors. The investors want to be
guaranteed that either the required amount vB is reached, or they will be able
to redeem their funds. To this purpose, C can employ the following contract. She
starts with a canonical transaction $T_V$ (with empty in field) which has a single
output of vB to be redeemed by V. Intuitively, each $A_i$ can invest money in the
campaign by "filling in" the in field of $T_V$ with a transaction output under
their control. To do this, $A_i$ sends to C a transaction output $(T_i, j_i)$, together
with the signature $\sigma_i$ required to redeem it. We denote with $\mathit{val}(T_i, j_i)$ the value
of such output. Notice that, since the signature $\sigma_i$ has been made on $T_V$, the
only valid output is the one of vB to be redeemed by V. Upon the reception
of the message from $A_i$, C updates $T_V$: the provided output is appended to
the in field, and the signature is added to the corresponding wit field. If all the
outputs $(T_i, j_i)$ are distinct (and not redeemed) and the signatures are valid,
when $\sum_i \mathit{val}(T_i, j_i) \geq v$ the filled transaction $T_V$ can be put on the blockchain.
If C collects $v' > v$ bitcoins, the difference $v' - v$ goes to the miners as transaction fee.
The endpoint protocol of the curator is defined as $X(T_V, 1, 0)$, where:


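$$\begin{array}{rcl}
X(x, n, d) & = & \mathsf{if}\ d < v\ \mathsf{then}\ P\ \mathsf{else}\ \mathsf{put}\ x \\
P & = & \sum_i A_i\,?\,(y, j, \sigma).\ X(x\{\mathsf{in}(n) \mapsto (y, j)\}\{\mathsf{wit}(n) \mapsto \sigma\},\ n + 1,\ d + \mathit{val}(y, j))
\end{array}$$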


5 For instance, https://www.oraclize.it and https://www.smartcontract.com/. 
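$T$:          in: $(T_A, 1)$;  wit: $\bot$;  out: $(\lambda\varsigma\varsigma'.\mathsf{versig}_{k_A k_B k_C}(\varsigma\varsigma'), 1\text{B})$
$T_{AB}(z)$:  in: $(T, 1)$;  wit: $\bot$;  out: $1 \mapsto (\lambda\varsigma.\mathsf{versig}_{k_A}(\varsigma), z\text{B})$, $2 \mapsto (\lambda\varsigma.\mathsf{versig}_{k_B}(\varsigma), (1 - z)\text{B})$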


Fig. 7. Transactions of the escrow contract. 


while the protocol of each investor $A_i$ is the following:
$$P_{A_i} = C\,!\,(T_i,\ j_i,\ \mathsf{sig}^{sa}_{k_{A_i}}(T_V\{\mathsf{in}(1) \mapsto (T_i, j_i)\}))$$


Note that the transactions sent by investors are not known a priori, so they
cannot just create the final transaction and sign it. Instead, to allow C to
complete the transaction $T_V$ without invalidating the signatures, they compute them
using the modifier $\mathit{sa}$. In this way, only a single input is signed, and when
verifying the corresponding signature, the others are neglected.
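
The curator's recursion can be sketched as a loop. The function `curator`, its dictionary-based transaction, and the `val` callback are hypothetical names for this illustration; signature validity and distinctness of the offered outputs are not checked here, just as they are not checked inside $X$.

```python
def curator(target, offers, val):
    """Fill the inputs of the funding transaction until the target is reached.

    target: amount v required by the venture
    offers: iterable of (tx_name, out_index, signature) messages from investors
    val:    function returning the value of a transaction output
    Returns (transaction, fee) if the target is reached, else (None, 0).
    """
    tx = {"in": {}, "wit": {}, "out": [("V", target)]}
    n, d = 1, 0
    for (y, j, sigma) in offers:
        if d >= target:          # the `else put x` branch: stop collecting
            break
        tx["in"][n] = (y, j)     # x{in(n) -> (y, j)}
        tx["wit"][n] = sigma     # {wit(n) -> sigma}
        n, d = n + 1, d + val(y, j)
    return (tx, d - target) if d >= target else (None, 0)

offers = [("T1", 1, "s1"), ("T2", 1, "s2"), ("T3", 1, "s3")]
funded, fee = curator(3, offers, lambda y, j: 2)
```

With three offers of 2 each and a target of 3, the loop stops after the second offer and the surplus of 1 is left to the miners as fee.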


4.3 Escrow 


In Example 4 we have discussed a naïve escrow contract, which is secure only if
both the buyer A and the seller B are honest (so making the contract pointless).
Rather, one would like to guarantee that, even if either A or B (or both) are
dishonest, exactly one of them will be able to redeem the money: in case they
disagree, a trusted participant C, who plays the role of arbiter, will decide who
gets the money (possibly splitting the initial deposit in two parts) [1,19].

The output script of the transaction $T$ in Fig. 7 is a 2-of-3 multi-signature
scheme. This means that $T$ can be redeemed either with the signatures of A and B
(in case they agree), or with the signature of C (with key $k_C$) and the signature of
A or that of B (in case they disagree). The transaction $T_{AB}(z)$ in Fig. 7 allows the
arbiter to issue a partial refund of zB to A, and of (1−z)B to B. Instead, to issue
a full refund to either A or B, the arbiter signs, respectively, the transactions
$T_A = T'_A\{\mathsf{in}(1) \mapsto (T, 1)\}$ or $T_B = T'_B\{\mathsf{in}(1) \mapsto (T, 1)\}$ (not shown in the
figure). The protocols of A and B are similar to those in Example 4, except for
the part where they ask C for an arbitration:


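$$\begin{array}{rcl}
P_A & = & \mathsf{put}\ T\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T)\}.\ (\tau.\ B\,!\,\mathsf{sig}^{aa}_{k_A}(T_B) + \tau.P') \\
P' & = & (B\,?\,x.\ (\mathsf{put}\ T_A\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_A)\ x\} + P'')) + P'' \\
P'' & = & C\,?\,(z, x).\ (\mathsf{check}\ z = 1\ .\ \mathsf{put}\ T_A\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_A)\ x\} \\
 & & \quad +\ \mathsf{check}\ 0 < z < 1\ .\ (\mathsf{put}\ T_{AB}(z)\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_{AB}(z))\ x\} + \tau.0) \\
 & & \quad +\ \mathsf{check}\ z = 0\ .\ 0)
\end{array}$$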


In the summation within $P_A$, participant A internally chooses whether to
send her signature to B (so allowing B to redeem 1B via $T_B$), or to proceed with
$P'$. There, A waits to receive either B's signature (which allows A to redeem 1B
$T_{AB}$:  in: $(T_A, 1)$;  wit: $\bot$
$T_{BC}$:  in: $(T_{AB}, 1)$;  wit: $\bot$

Fig. 8. Transactions of the intermediated payment contract.


by putting $T_A$ on the blockchain), or a response from the arbiter, in the process
$P''$. The three cases in the summation of check in $P''$ correspond, respectively,
to the case where A gets a full refund ($z = 1$), a partial refund ($0 < z < 1$), or
no refund at all ($z = 0$).

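The protocol for B is dual to that of A:

$$\begin{array}{rcl}
Q_B & = & \mathsf{ask}\ T.\ (\tau.\ A\,!\,\mathsf{sig}^{aa}_{k_B}(T_A) + \tau.Q') \\
Q' & = & (A\,?\,x.\ (\mathsf{put}\ T_B\{\mathsf{wit} \mapsto x\ \mathsf{sig}^{aa}_{k_B}(T_B)\} + Q'')) + Q'' \\
Q'' & = & C\,?\,(z, x).\ (\mathsf{check}\ z = 0\ .\ \mathsf{put}\ T_B\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_B}(T_B)\ x\} \\
 & & \quad +\ \mathsf{check}\ 0 < z < 1\ .\ (\mathsf{put}\ T_{AB}(z)\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_B}(T_{AB}(z))\ x\} + \tau.0) \\
 & & \quad +\ \mathsf{check}\ z = 1\ .\ 0)
\end{array}$$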


If an arbitration is requested, C internally decides (through the $\tau$ actions)
who between A and B can redeem the deposit in $T$, by sending its signature to
one of the two participants, or decides for a partial refund of $z$ and $1 - z$ bitcoins,
respectively, to A and B, by sending its signature on $T_{AB}(z)$ to both participants:

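$$\begin{array}{rcl}
R_C & = & \mathsf{ask}\ T.\ (\tau.\ A\,!\,(1, \mathsf{sig}^{aa}_{k_C}(T_A)) + \tau.\ B\,!\,(0, \mathsf{sig}^{aa}_{k_C}(T_B)) + \tau.R_{AB}) \\
R_{AB} & = & \sum_{0 < z < 1} \tau.\ (A\,!\,(z, \mathsf{sig}^{aa}_{k_C}(T_{AB}(z))) \mid B\,!\,(z, \mathsf{sig}^{aa}_{k_C}(T_{AB}(z))))
\end{array}$$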


Note that, in the unlikely case where both A and B choose to send their 
signature to the other participant, the 1B deposit becomes “frozen”. In a more 
concrete version of this contract, a participant could keep listening for the sig- 
nature, and attempt to redeem the deposit when (unexpectedly) receiving it. 


4.4 Intermediated Payment 


Assume that A wants to send an indirect payment of $v_C$B to C, routing it through
an intermediary B who retains a fee of $v_B < v_C$ bitcoins. Since A does not trust
B, she wants to use a contract to guarantee that: (i) if B is honest, then $v_C$B
are transferred to C; (ii) if B is not honest, then A does not lose money. The
contract uses the transactions in Fig. 8: $T_{AB}$ transfers $(v_B + v_C)$B from A to B,
and $T_{BC}$ splits the amount to B ($v_B$B) and to C ($v_C$B). We assume that $(T_A, 1)$
is a transaction output redeemable by A. The behaviour of A is as follows:


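$$\begin{array}{rcl}
P_A & = & (B\,?\,x.\ \mathsf{if}\ \mathsf{versig}_{k_B}(x, T_{BC})\ \mathsf{then}\ P'\ \mathsf{else}\ 0) + \tau \\
P' & = & \mathsf{put}\ T_{AB}\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_{AB})\}.\ \mathsf{put}\ T_{BC}\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_{BC})\ x\}
\end{array}$$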
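$T_{com}$:   in: $(T_A, 1)$;  wit: $\bot$;  out: $(\lambda x\varsigma\varsigma'.(\mathsf{versig}_{k_A}(\varsigma) \text{ and } H(x) = h) \text{ or } \mathsf{versig}_{k_A k_B}(\varsigma\varsigma'), v\text{B})$
$T_{open}$:  in: $(T_{com}, 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_A}(\varsigma), v\text{B})$
$T_{pay}$:   in: $(T_{com}, 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_B}(\varsigma), v\text{B})$;  relLock: $t$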


Fig. 9. Transactions of the timed commitment. 


Here, A receives from B his signature on $T_{BC}$, which makes it possible to
pay C later on. The $\tau$ branch and the else branch ensure that A will correctly
terminate also if B is dishonest (i.e., B does not send anything, or he sends an
invalid signature). If A receives a valid signature, she puts $T_{AB}$ on the blockchain,
adding her signature to the wit field. Then, she also appends $T_{BC}$, adding to
the wit field her signature and that of B. Since A takes care of publishing both
transactions, the behaviour of B consists just in sending his signature on $T_{BC}$.
Therefore, B's protocol can just be modelled as $Q_B = A\,!\,\mathsf{sig}^{aa}_{k_B}(T_{BC})$.

This contract relies on SegWit. In Bitcoin without SegWit, the identifier of
$T_{AB}$ is affected by the instantiation of the wit field. So, when $T_{AB}$ is put on the
blockchain, the input in $T_{BC}$ (which was computed before) does not point to it.
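
The dependence on SegWit can be illustrated with a toy identifier function. The JSON serialisation and single SHA-256 below are assumptions chosen for brevity and do not reproduce Bitcoin's actual transaction encoding; `txid`, `t_ab`, and `t_bc` are hypothetical names.

```python
import hashlib
import json

def txid(tx, segwit=True):
    """Toy identifier: hash of the transaction, skipping witnesses when segwit is True."""
    body = {k: v for k, v in tx.items() if not (segwit and k == "wit")}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

t_ab = {"in": [["T_A", 1]], "wit": None, "out": [["script_AB", 3]]}

# T_BC is prepared (and signed by B) before T_AB receives its witness.
t_bc = {"in": [[txid(t_ab), 1]], "wit": None, "out": [["B", 1], ["C", 2]]}
pre_segwit_ref = txid(t_ab, segwit=False)

t_ab["wit"] = "sig_A"   # A adds her signature just before publishing T_AB

still_points = t_bc["in"][0][0] == txid(t_ab)               # witness excluded from the id
broken_without_segwit = pre_segwit_ref != txid(t_ab, segwit=False)
```

When the witness is excluded from the identifier, the reference stored in `t_bc` survives the signing of `t_ab`; when it is included, the identifier changes and the reference dangles.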


4.5 Timed Commitment 


Assume that A wants to choose a secret $s$, and reveal it after some time, while
guaranteeing that the revealed value corresponds to the chosen secret (or paying
a penalty otherwise). This can be obtained through a timed commitment [20],
a protocol with applications e.g. in gambling games [25,28,42], where the secret
contains the player's move, and the delay in the revelation of the secret is intended
to prevent other players from altering the outcome of the game. Here we formalise
the version of the timed commitment protocol presented in [8].

Intuitively, A starts by exposing the hash of the secret, i.e. h = H(s), and at 
the same time depositing some amount vB in a transaction. The participant B 
has the guarantee that after t time units, he will either know the secret s, or he 
will be able to redeem vB. 

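The transactions of the protocol are shown in Fig. 9, where we assume that
$(T_A, 1)$ is a transaction output redeemable by A. The behaviour of A is modelled
as the following protocol:

$$\begin{array}{rcl}
P_A & = & \mathsf{put}\ T_{com}\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_{com})\}.\ B\,!\,\mathsf{sig}^{aa}_{k_A}(T_{pay}).\ P' \\
P' & = & \tau.\ \mathsf{put}\ T_{open}\{\mathsf{wit} \mapsto s\ \mathsf{sig}^{aa}_{k_A}(T_{open})\ \bot\} + \tau
\end{array}$$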


Participant A starts by putting the transaction $T_{com}$ on the blockchain. Note
that within this transaction A is committing the hash of the chosen secret:
indeed, $h$ is encoded within the output script $T_{com}.\mathsf{out}$. Then, A sends to B her
signature on $T_{pay}$. Note that this transaction can be redeemed by B only when $t$
time units have passed since $T_{com}$ has been published on the blockchain, because
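$T_{bond}$:    in: $(T_A, 1)$;  wit: $\bot$;  out: $(\lambda\varsigma\varsigma'.\mathsf{versig}_{k_A k_B}(\varsigma\varsigma') \text{ or } \mathsf{relAfter}\ t : \mathsf{versig}_{k_A}(\varsigma), k\text{B})$
$T_{pay}(v)$:  in: $(T_{bond}, 1)$;  wit: $\bot$;  out: $1 \mapsto (\lambda\varsigma.\mathsf{versig}_{k_A}(\varsigma), (k - v)\text{B})$, $2 \mapsto (\lambda\varsigma.\mathsf{versig}_{k_B}(\varsigma), v\text{B})$
$T_{ref}$:     in: $(T_{bond}, 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_A}(\varsigma), k\text{B})$;  relLock: $t$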


Fig. 10. Transactions of the micropayment channel contract. 


of the relative timelock declared in $T_{pay}.\mathsf{relLock}$. After sending her signature
on $T_{pay}$, A internally chooses whether to reveal the secret, or do nothing (via the
$\tau$ actions). In the first case, A must put the transaction $T_{open}$ on the blockchain.
Since it redeems $T_{com}$, she needs to write in $T_{open}.\mathsf{wit}$ both the secret $s$ and her
signature, so making the former public.

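A possible behaviour of the receiver B is the following:

$$\begin{array}{rcl}
Q_B & = & (A\,?\,x.\ \mathsf{if}\ \mathsf{versig}_{k_A}(x, T_{pay})\ \mathsf{then}\ Q\ \mathsf{else}\ 0) + \tau \\
Q & = & \mathsf{put}\ T_{pay}\{\mathsf{wit} \mapsto \bot\ x\ \mathsf{sig}^{aa}_{k_B}(T_{pay})\} + \mathsf{ask}\ T_{open}\ \mathsf{as}\ o.\ Q'(\mathit{get\_secret}(o))
\end{array}$$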

In this protocol, B first receives from A (and saves in $x$) her signature on
the transaction $T_{pay}$. Then, B checks if the signature is valid: if not, he aborts
the protocol. Even if the signature is valid, B cannot put $T_{pay}$ on the blockchain
and redeem the deposit immediately, since the transaction has a timelock $t$.
Note that B cannot change the timelock: indeed, doing so would invalidate A's
signature on $T_{pay}$. If, after $t$ time units, A has not published $T_{open}$ yet, B can
proceed to put $T_{pay}$ on the blockchain, writing A's and his own signatures in the
witness. Otherwise, B retrieves $T_{open}$ from the blockchain, from which he can
obtain the secret, and use it in $Q'$.

A variant of this contract, which implements the timeout in $T_{com}.\mathsf{out}$, and
does not require the signature exchange, is used in Sect. 4.7.
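
The guarantee offered to B can be sketched in Python. SHA-256 is used for the hash function $H$ as an assumption, signatures are abstracted as booleans, and the names `com_script` and `outcome` are introduced only for this illustration.

```python
import hashlib

def H(data: bytes) -> bytes:
    """Hash function of the commitment (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()

def com_script(h, x=None, sig_a=False, sig_b=False):
    """Output script of T_com: (A's signature and H(x) = h) or both signatures."""
    opened = sig_a and x is not None and H(x) == h
    return opened or (sig_a and sig_b)

def outcome(h, revealed, now, commit_time, delay):
    """What B obtains at time `now`, given what A has revealed so far."""
    if revealed is not None and com_script(h, x=revealed, sig_a=True):
        return ("secret", revealed)      # T_open is valid: the secret is public
    if now - commit_time >= delay:       # relLock of T_pay is satisfied
        return ("deposit", None)         # B publishes T_pay with both signatures
    return ("wait", None)

s = b"my move"
h = H(s)
```

After the delay has elapsed, B therefore holds either the secret or the deposit; revealing a value that does not hash to `h` does not help A.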


4.6 Micropayment Channels 


Assume that A wants to make a series of micropayments to B, e.g. a small fraction
of a bitcoin every few minutes. Doing so with one transaction per payment would result
in conspicuous fees⁶, so A and B use a micropayment channel contract [29]. A
starts by depositing kB; then, she signs a transaction that pays vB to B and
(k − v)B back to herself, and she sends that transaction to B. Participant B
can choose to publish that transaction immediately and redeem its payment, or
to wait in case A sends another transaction with increased value. A can stop
sending signatures at any time. If B redeems, then A can get back the remaining
amount. If B does not cooperate, A can redeem all the amount after a timeout.

The protocol of A is the following (the transactions are in Fig. 10). A publishes
the transaction $T_{bond}$, depositing kB that can be spent with her signature and
that of B, or with her signature alone, after time $t$. A can redeem the deposit by


6 https://bitinfocharts.com/comparison/bitcoin-transactionfees.html.
publishing the transaction $T_{ref}$. To pay for the service, A sends to B the amount
$v$ she is paying, and her signature on $T_{pay}(v)$. Then, she can decide to increase
$v$ and recur, or to terminate.


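$$\begin{array}{rcl}
P_A & = & \mathsf{put}\ T_{bond}\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_{bond})\}.\ (P(1) \mid \mathsf{put}\ T_{ref}\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_{ref})\}) \\
P(v) & = & B\,!\,(v, \mathsf{sig}^{aa}_{k_A}(T_{pay}(v))).\ (\tau + \tau.P(v + 1))
\end{array}$$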


The participant B waits for $T_{bond}$ to appear on the blockchain, then receives
the first value $v$ and A's signature $\sigma$. Then, B checks if $\sigma$ is valid, otherwise he
aborts the protocol. At this point, B waits for another pair $(v', \sigma')$, or, after a
timeout, he redeems vB using $T_{pay}(v)$.


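$$\begin{array}{rcl}
Q_B & = & \mathsf{ask}\ T_{bond}.\ A\,?\,(v, \sigma).\ \mathsf{if}\ \mathsf{versig}_{k_A}(\sigma, T_{pay}(v))\ \mathsf{then}\ P'(v, \sigma)\ \mathsf{else}\ \tau \\
P'(v, \sigma) & = & \tau.P_{pay}(v, \sigma)\ + \\
 & & A\,?\,(v', \sigma').\ \mathsf{if}\ v' > v\ \mathsf{and}\ \mathsf{versig}_{k_A}(\sigma', T_{pay}(v'))\ \mathsf{then}\ P'(v', \sigma')\ \mathsf{else}\ P'(v, \sigma) \\
P_{pay}(v, \sigma) & = & \mathsf{put}\ T_{pay}(v)\{\mathsf{wit} \mapsto \sigma\ \mathsf{sig}^{aa}_{k_B}(T_{pay}(v))\}
\end{array}$$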


Note that $Q_B$ should redeem $T_{pay}$ before the timeout expires, which is not
modelled in $Q_B$. This could be obtained by enriching the calculus with
time-constraining operators (see Footnote 3).
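
The receiver's loop can be sketched as follows. The function `receiver` and the `verify` callback are hypothetical; the sketch processes a finite list of messages, whereas the protocol may stop at any point through its internal action.

```python
def receiver(messages, verify):
    """Keep the best valid (v, sigma) pair and return it for redemption.

    messages: sequence of (v, sigma) pairs sent by A
    verify:   verify(sigma, v) -> bool, standing in for signature verification on T_pay(v)
    Returns None if the first message is missing or invalid (B aborts).
    """
    it = iter(messages)
    first = next(it, None)
    if first is None or not verify(first[1], first[0]):
        return None
    v, sigma = first
    for v2, s2 in it:
        # Replace the stored pair only with a valid payment of larger value.
        if v2 > v and verify(s2, v2):
            v, sigma = v2, s2
    return v, sigma

toy_verify = lambda s, v: s == f"sig{v}"
best = receiver([(1, "sig1"), (2, "sig2"), (5, "bad"), (1, "sig1"), (3, "sig3")], toy_verify)
```

Invalid or smaller payments leave the stored pair unchanged, so B always redeems the largest amount that A has validly signed.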


4.7 Fair Lotteries 


A multiparty lottery is a protocol where N players put their bets in a pot, and a
winner, uniformly chosen among the players, redeems the whole pot. Various
contracts for multiparty lotteries on Bitcoin have been proposed in
[8,9,12,14,16,36]. These contracts enjoy a fairness property, which roughly guarantees that:
(i) each honest player will have (on average) a non-negative payoff, even in the
presence of adversaries; (ii) when all the players are honest, the protocol behaves
as an ideal lottery: one player wins the whole pot (with probability 1/N), while
all the others lose their bets (with probability (N−1)/N).

Here we illustrate the lottery in [8], for N = 2. Consider two players A and
B who want to bet 1B each. Their protocol is composed of two phases. The first
phase is a timed commitment (as in Sect. 4.5): each player chooses a secret ($s_A$
and $s_B$) and commits its hash ($h_A = H(s_A)$ and $h_B = H(s_B)$). In doing that,
both players put a deposit of 2B on the ledger, which is used to compensate the
other player in case one chooses not to reveal the secret later on. In the second
phase, the two bets are put on the ledger. After that, the players reveal their
secrets, and redeem their deposits. Then, the secrets are used to compute the
winner of the lottery in a fair manner. Finally, the winner redeems the bets.

The transactions needed for this lottery are displayed in Fig. 11 (we only show
A's transactions, as those of B are similar). The transactions for the commitment
phase ($T_{com}$, $T_{open}$, $T_{pay}$) are similar to those in Sect. 4.5: they only differ in the
script of $T_{com}.\mathsf{out}$, which now also checks that the length of the secret is either
128 or 129. This check forces the players to choose their secret so that it has one
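$T_{Acom}(h_A)$:  in: $(T_A, 1)$;  wit: $\bot$;
  out: $(\lambda x\varsigma.(\mathsf{versig}_{k_A}(\varsigma) \text{ and } H(x) = h_A \text{ and } (|x| = 128 \text{ or } |x| = 129)) \text{ or } \mathsf{absAfter}\ t : \mathsf{versig}_{k_B}(\varsigma), 2\text{B})$
$T_{lottery}(h_A, h_B)$:  in: $1 \mapsto (T_{Abet}, 1)$, $2 \mapsto (T_{Bbet}, 1)$;  wit: $\bot$;
  out: $(\lambda\varsigma x y.\ H(x) = h_A \text{ and } H(y) = h_B \text{ and } (|x| = 128 \text{ or } |x| = 129) \text{ and } (|y| = 128 \text{ or } |y| = 129)$
       $\text{and if } |x| = |y| \text{ then } \mathsf{versig}_{k_A}(\varsigma) \text{ else } \mathsf{versig}_{k_B}(\varsigma), 2\text{B})$
$T_{Aopen}(h_A)$:  in: $(T_{Acom}(h_A), 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_A}(\varsigma), 2\text{B})$
$T_{Apay}(h_A)$:  in: $(T_{Acom}(h_A), 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_B}(\varsigma), 2\text{B})$;  absLock: $t$
$T_{Awin}(h_A, h_B)$:  in: $(T_{lottery}(h_A, h_B), 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_A}(\varsigma), 2\text{B})$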


Fig. 11. Transactions of the fair lottery with deposit. 


of these lengths, and reveal it (using $T_{open}$) before the absLock deadline, since
otherwise they will lose their deposits (enabling $T_{pay}$).

The bets are put using $T_{lottery}$, whose output script computes the winner
using the secrets, which can then be revealed. For this, the secret lengths are
compared: if equal, A wins, otherwise B wins. In this way, the lottery is
equivalent to a coin toss. Note that, if a malicious player chooses a secret whose
length is neither 128 nor 129, the $T_{lottery}$ transaction will become stuck, but
its opponent will be compensated using the deposit.
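
The winner computation can be sketched in Python. Two assumptions are made here: lengths are measured in bytes (the text does not state the unit) and $H$ is SHA-256. The names `new_secret` and `winner` are illustrative.

```python
import hashlib
import secrets

def new_secret():
    """Pick a random secret of 128 or 129 bytes, each length with probability 1/2."""
    return secrets.token_bytes(128 + secrets.randbelow(2))

def winner(h_a, h_b, x, y):
    """Mirror of the T_lottery output script; returns None if the script fails."""
    ok = (hashlib.sha256(x).digest() == h_a
          and hashlib.sha256(y).digest() == h_b
          and len(x) in (128, 129)
          and len(y) in (128, 129))
    if not ok:
        return None                      # the bets stay locked in T_lottery
    return "A" if len(x) == len(y) else "B"

x128, y128, y129 = b"\0" * 128, b"\1" * 128, b"\1" * 129
hx, hy128, hy129 = (hashlib.sha256(v).digest() for v in (x128, y128, y129))
```

If at least one player draws the length uniformly, the two lengths coincide with probability 1/2, which is why the comparison behaves as a coin toss.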

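The endpoint protocol $P_A$ of player A follows (the one for B is similar):

$$\begin{array}{rcl}
P_A & = & \mathsf{put}\ T_{Acom}\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_{Acom})\}.\ (\mathsf{ask}\ T_{Bcom}\ \mathsf{as}\ y.\ P' + \tau.P_{open}) \\
P' & = & \mathsf{let}\ h_B = \mathit{get\_hash}(y)\ \mathsf{in}\ \mathsf{if}\ h_B \neq h_A\ \mathsf{then}\ P_{pay} \mid P''\ \mathsf{else}\ P_{pay} \mid P_{open} \\
P'' & = & B\,?\,x.\ P''' + \tau.P_{open} \\
P''' & = & \mathsf{let}\ \sigma = \mathsf{sig}^{aa,1}_{k_A}(T_{lottery}(h_A, h_B))\ \mathsf{in} \\
 & & (\mathsf{put}\ T_{lottery}(h_A, h_B)\{\mathsf{wit}(1) \mapsto \sigma\}\{\mathsf{wit}(2) \mapsto x\}.\ (P_{open} \mid P_{win})) + \tau.P_{open} \\
P_{pay} & = & \mathsf{put}\ T_{Bpay}\{\mathsf{wit} \mapsto \bot\ \mathsf{sig}^{aa}_{k_A}(T_{Bpay})\} \\
P_{open} & = & \mathsf{put}\ T_{Aopen}\{\mathsf{wit} \mapsto s_A\ \mathsf{sig}^{aa}_{k_A}(T_{Aopen})\} \\
P_{win} & = & \mathsf{ask}\ T_{Bopen}\ \mathsf{as}\ z.\ P'_{win} \\
P'_{win} & = & \mathsf{put}\ T_{Awin}(h_A, h_B)\{\mathsf{wit} \mapsto \mathsf{sig}^{aa}_{k_A}(T_{Awin}(h_A, h_B))\ s_A\ \mathit{get\_secret}(z)\}
\end{array}$$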


Player A starts by putting $T_{Acom}$ on the blockchain, then she waits for B
to do the same. If B does not cooperate, A can safely abort the protocol taking
its $\tau.P_{open}$ branch, so redeeming her deposit with $T_{Aopen}$ (as usual, here with $\tau$
we are modelling a timeout). If B commits his secret, A executes $P'$, extracting
the hash $h_B$ of B's secret, and checking whether it is distinct from $h_A$. If the
hashes are found to be equal, A aborts the protocol using $P_{open}$. Otherwise, A
runs $P'' \mid P_{pay}$. The $P_{pay}$ component attempts to redeem B's deposit, as soon
as the absLock deadline of $T_{Bpay}$ expires, forcing B to timely reveal his secret.
Instead, $P''$ proceeds with the lottery, asking B for his signature of $T_{lottery}$. If B
does not sign, A aborts using $P_{open}$. Then, A runs $P'''$, finally putting the bets
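$T_{cp}(h)$:      in: $(T_A, 1)$;  wit: $\bot$;  out: $(\lambda x\varsigma.(\mathsf{versig}_{k_B}(\varsigma) \text{ and } H(x) = h) \text{ or } \mathsf{relAfter}\ t : \mathsf{versig}_{k_A}(\varsigma), v\text{B})$
$T_{open}(h)$:    in: $(T_{cp}(h), 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_B}(\varsigma), v\text{B})$
$T_{refund}(h)$:  in: $(T_{cp}(h), 1)$;  wit: $\bot$;  out: $(\lambda\varsigma.\mathsf{versig}_{k_A}(\varsigma), v\text{B})$;  relLock: $t$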


Fig. 12. Transactions of the contingent payment. 


($T_{lottery}$) on the ledger. If this is not possible (e.g., because one of the $T_{bet}$ is
already spent), A aborts using $P_{open}$. After $T_{lottery}$ is on the ledger, A reveals
her secret and redeems her deposit with $P_{open}$. In parallel, with $P_{win}$ she waits
for the secret of B to be revealed, and then attempts to redeem the pot ($T_{Awin}$).

The fairness of this lottery has been established in [8]. This protocol can be 
generalised to N > 2 players [8,9] but in this case the deposit grows quadratically 
with N. The works [14,36] have proposed fair multiparty lotteries that require, 
respectively, zero and constant (> 0) deposit. More precisely, [36] devises two 
variants of the protocol: the first one only relies on SegWit, but requires each 
player to statically sign O(2^N) transactions; the second variant reduces the number
of signatures to O(N^2), at the cost of introducing a custom opcode. Also the
protocol in [14] assumes an extension of Bitcoin, i.e. the malleability of in fields, 
to obtain an ideal fair lottery with O(N) signatures per player (see Sect. 5). 


4.8 Contingent Payments 


Assume a participant A who wants to pay vB to receive a value s which makes 
a public predicate p true, where p(s) can be verified efficiently. A seller B who 
knows such s is willing to reveal it to A, but only under the guarantee that he 
will be paid vB. Similarly, the buyer wants to pay only if guaranteed to obtain s. 

A naive attempt to implement this contract in Bitcoin is the following: A
creates a transaction T such that T.out(ς, x) evaluates to true if and only if p(x)
holds and ς is a signature of B. Hence, B can redeem vB from T by revealing s.
In practice, though, this approach is of limited use, since it requires coding p
in the Bitcoin scripting language, whose expressiveness is quite limited.

More general contingent payment contracts can be obtained by exploiting
zero-knowledge proofs [13,24,35]. In this setting, the seller generates a fresh key
k, and sends to the buyer the encryption e_s = E_k(s), together with the hash
h_k = H(k), and a zero-knowledge proof guaranteeing that such messages have
the intended form. After verifying this proof, A is sure that B knows a preimage
k′ of h_k (by collision resistance, k′ = k) such that D_k′(e_s) satisfies the predicate
p, and so she can buy the preimage k of h_k with the naive protocol, so obtaining
the solution s by decrypting e_s with k.
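
The data flow of this exchange can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: H is instantiated with SHA-256, E and D are a toy XOR stream cipher built from SHA-256 (illustrative, not a vetted cipher), the predicate p and all function names are hypothetical, and neither the zero-knowledge proof nor the signature check of the output script is modelled.

```python
import hashlib
import secrets


def H(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(data).digest()


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Not a vetted cipher."""
    blocks = []
    counter = 0
    while 32 * len(blocks) < length:
        blocks.append(H(key + counter.to_bytes(8, "big")))
        counter += 1
    return b"".join(blocks)[:length]


def E(key: bytes, message: bytes) -> bytes:
    """Encrypt by XOR with the keystream; decryption is the same operation."""
    return bytes(m ^ s for m, s in zip(message, _keystream(key, len(message))))


D = E

# Hypothetical public predicate p: "x is a preimage of a published digest".
PUBLIC_DIGEST = H(b"solution known only to the seller")


def p(x: bytes) -> bool:
    return H(x) == PUBLIC_DIGEST


def can_redeem(h_k: bytes, revealed_key: bytes) -> bool:
    """Hash-lock part of the check performed by the payment's output script."""
    return H(revealed_key) == h_k


# Seller B: fresh key k, ciphertext e_s and hash h_k (the ZK proof is omitted).
s = b"solution known only to the seller"
k = secrets.token_bytes(32)
e_s, h_k = E(k, s), H(k)

# B redeems the payment by revealing k; A then decrypts e_s with k.
assert can_redeem(h_k, k)
recovered = D(k, e_s)
```

The point of the construction is that the script only has to compare a hash, while the arbitrary predicate p is checked off-chain through the proof.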
The transactions implementing this contract are displayed in Fig. 12. The
relAfter clause in Tcp allows A to redeem vB if no solution is provided by the
deadline t. The behaviour of the buyer A can be modelled as follows:


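$P_A = B\,?\,(e_s, h_k, z).\ P + \tau$
$P = \mathsf{if}\ verify(e_s, h_k, z)\ \mathsf{then}\ \mathsf{put}\ T_{cp}(h_k)\{wit \mapsto sig_{k_A}(T_{cp}(h_k))\}.\ P'\ \mathsf{else}\ 0$
$P' = \mathsf{ask}\ T_{open}(h_k)\ \mathsf{as}\ x.\ P''(D_{get_k(x)}(e_s)) + \mathsf{put}\ T_{refund}(h_k)\{wit \mapsto \bot\ sig_{k_A}(T_{refund}(h_k))\}$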


Upon receiving e_s, h_k and the proof z,⁷ the buyer verifies z. If the verification
succeeds, A puts Tcp(h_k) on the blockchain. Then, she waits for Topen, from
which she can retrieve the key k, and so use the solution D_get_k(x)(e_s) in P″. In
this way, B can redeem vB. If B does not put Topen, after t time units A can get
her deposit back through Trefund. The protocol of B is simple, so it is omitted.


5 Research Challenges and Perspectives 


Extensions to Bitcoin. The formal model of smart contracts we have proposed 
is based on the current mechanisms of Bitcoin; indeed, this makes it possible to 
translate endpoint protocols into actual implementations interacting with the 
Bitcoin blockchain. However, constraining smart contracts to perfectly adhere 
to Bitcoin greatly reduces their expressiveness. Indeed, the Bitcoin scripting 
language features a very limited set of operations⁸, and over the years many
useful (and apparently harmless) opcodes have been disabled without a clear
understanding of their alleged insecurity⁹. This is the case, e.g., of bitwise logic
operators, shift operators, integer multiplication, division and modulus.

For this reason some developers proposed to re-enable some disabled
opcodes¹⁰, and some works in the literature proposed extensions to the Bitcoin
scripting language so as to enhance the expressiveness of smart contracts.

A possible extension is covenants [37], a mechanism that allows an output 
script to constrain the structure of the redeeming transaction. This is obtained 
through a new opcode, called CHECKOUTPUTVERIFY, which checks if a given out of 
the redeeming transaction matches a specific pattern. Covenants are also studied 
in [41], where they are implemented using the opcode CAT (currently disabled)
and a new opcode CHECKSIGFROMSTACK which verifies a signature against an 
arbitrary bitstring on the stack. In both works, covenants can also be recursive, 
e.g. a covenant can check if the redeeming transaction contains itself. Recursive
covenants make it possible to implement a state machine through a sequence of
transactions that store its state.


7 For simplicity, here we model the zero-knowledge proof as a single message. More
concretely, it should be modelled as a sub-protocol.
8 https://en.bitcoin.it/wiki/Script.
9 https://en.bitcoin.it/wiki/Common_Vulnerabilities_and_Exposures#CVE-2010-5141.
10 https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-May/014356.html.
Secure cash distribution with penalties [8,16,32] is a cryptographic primitive
which allows a set of participants to make a deposit, and then provide inputs to 
a function whose evaluation determines how the deposits are distributed among 
the participants. This primitive guarantees that dishonest participants (who, 
e.g., abort the protocol after learning the value of the function) will pay a penalty 
to the honest participants. This primitive does not seem to be directly imple- 
mentable in Bitcoin, but it becomes so by extending the scripting language with 
the opcode CHECKSIGFROMSTACK discussed above. Secure cash distribution with 
penalties can be instantiated to a variety of smart contracts, e.g. lotteries [8],
poker [32], and contingent payments. The latter smart contract can also be
obtained through the opcode CHECKKEYPAIRVERIFY in [24], which checks if the 
two top elements of the stack are a valid key pair. 

Another new opcode, called MULTIINPUT [36], consumes from the stack a
signature σ and a sequence of in values (T1, j1) ··· (Tn, jn), with the following
two effects: (i) it verifies the signature σ against the redeeming transaction T,
neglecting T.in; (ii) it requires T.in to be equal to some of the Ti. Exploiting this
opcode, [36] devises a fair N-party lottery which requires zero deposit, and O(N^2)
off-chain signed transactions. The first of these effects can be alternatively
obtained by extending, instead of the scripting language, the signature modifiers.
More specifically, [14] introduces a new signature modifier, which can set to ⊥
all the inputs of a transaction (i.e., no input is signed). In this way they obtain
a fair multi-party lottery with similar properties to the one in [36].

Another way to improve the expressiveness of smart contracts is to replace the
Bitcoin scripting language, e.g. with the one in [40]. This would also make it
possible to establish bounds on the computational resources needed to run scripts.

Unfortunately, none of the proposed extensions has yet been included in the
main branch of the Bitcoin Core client, and nothing suggests that they will be
considered in the near future. Indeed, the development of Bitcoin is extremely
conservative, as any change to its protocol requires an overwhelming consensus
of the miners. So far, new opcodes can only be empirically assessed through the
Elements alpha project¹¹, a testnet for experimenting with new Bitcoin features. A
significant research challenge would be that of formally proving that new opcodes 
do not introduce vulnerabilities, exploitable e.g. by Denial-of-Service attacks. For 
instance, unconstrained uses of the opcode CAT may cause an exponential space 
blow-up in the verification of transactions. 
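
To see where the blow-up comes from, consider a script that repeatedly duplicates the top stack element and concatenates the two copies: a script of 2n opcodes then yields an element of 2^n times the initial size. The following minimal sketch simulates this on a toy stack; it is a simplified model written for illustration, not the Bitcoin script interpreter, and the function name is hypothetical.

```python
def dup_cat_size(initial_len: int, rounds: int) -> int:
    """Size in bytes of the top stack element after `rounds` DUP/CAT pairs."""
    stack = [b"\x00" * initial_len]
    for _ in range(rounds):
        stack.append(stack[-1])   # DUP: copy the top element
        second = stack.pop()      # CAT: pop the two top elements ...
        first = stack.pop()
        stack.append(first + second)  # ... and push their concatenation
    return len(stack[-1])


# Ten DUP/CAT pairs (20 opcodes) turn a 1-byte element into 1024 bytes.
size_after_ten = dup_cat_size(1, 10)
```

Each round doubles the element, so bounding the script length alone does not bound the memory needed for verification; a limit on element size would be needed as well.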


Formal Methods for Bitcoin Smart Contracts. As witnessed in Sect. 4, 
designing secure smart contracts on Bitcoin is an error-prone task, simi- 
larly to designing secure cryptographic protocols. The reason lies in the fact 
that, to devise a secure contract, a designer has to anticipate any possible 
(mis-)behaviour of the other participants. The side effect is that endpoint pro- 
tocols may be quite convoluted, as they must include compensations at all the 
points where something can go wrong. Therefore, tools to automate the analysis 
and verification of smart contracts may be of great help. 


11 https://elementsproject.org/elements/opcodes/.
Recent works [7] propose to verify Bitcoin smart contracts by modelling the
behaviour of participants as timed automata, and then using UPPAAL [15] to 
check properties against an attacker. This approach correctly captures the time 
constraints within the contracts. The downside is that encoding this UPPAAL 
model into an actual implementation with Bitcoin transactions is a complex task. 
Indeed, a designer without a deep knowledge of Bitcoin technicalities is likely 
to produce an UPPAAL model that cannot be encoded in Bitcoin. A relevant
research challenge is to study specification languages for Bitcoin contracts (like 
e.g. the one in Sect. 3), and techniques to automatically encode them in a model 
that can be verified by a model checker. 

Remarkably, verifying security properties of smart contracts requires
dealing with non-trivial aspects, like temporal constraints and probabilities. This
is the case, e.g., for the verification of fairness of lotteries (like e.g. the one 
discussed in Sect. 4.7); a further problem is that fairness must hold against any 
adversarial strategy. It is not clear whether in this case it is sufficient to consider 
a “most powerful” adversary, like e.g. in the symbolic Dolev-Yao model. In case 
a contract is not secure against arbitrary (PTIME) adversaries, one would like 
to verify that, at least, it is secure against rational ones [27], which is a relevant 
research issue. Additional issues arise when considering more concrete models
of the Bitcoin blockchain, with respect to the one in Sect. 2. This would require
modelling forks, i.e. the possibility that a recent transaction is removed from the
blockchain. This could happen with rational (but dishonest) miners [33].


DSLs for Smart Contracts. As witnessed in Sect. 4, modelling Bitcoin smart 
contracts is complex and error-prone. A possible way to address this complex- 
ity is to devise high-level domain-specific languages (DSLs) for contracts, to be 
compiled in low-level protocols (e.g., the ones in Sect. 3). Indeed, the recent pro- 
liferation of non-Turing complete DSLs for smart contracts [18,22,26] suggests 
that this is an emerging research direction. 

A first proposal of a high-level language implemented on top of Bitcoin is
Typecoin [23]. This language allows modelling the updates of a state machine as
affine logic propositions. Users can “run” this machine by putting transactions 
on the Bitcoin blockchain. The security of the blockchain guarantees that only 
the legit updates of the machine can be triggered by users. A downside of this 
approach is that liveness is guaranteed only by assuming cooperation among the 
participants, i.e., a dishonest participant can make the others unable to complete 
an execution. Note instead that the smart contracts in Sect. 4 allow honest par- 
ticipants to terminate, regardless of the behaviours of the environment. In some 
cases, e.g. in the lottery in Sect. 4.7, abandoning the contract may even result in 
penalties (i.e., loss of the deposit paid upfront to stipulate the contract). 
Abstract. Smart contracts are programs running on cryptocurrency
(e.g., Ethereum) blockchains, whose popularity stems from the possibility
to perform financial transactions, such as payments and auctions, in a 
distributed environment without need for any trusted third party. Given 
their financial nature, bugs or vulnerabilities in these programs may 
lead to catastrophic consequences, as witnessed by recent attacks. Unfor- 
tunately, programming smart contracts is a delicate task that requires 
strong expertise: Ethereum smart contracts are written in Solidity, a ded- 
icated language resembling JavaScript, and shipped over the blockchain 
in the EVM bytecode format. In order to rigorously verify the security of 
smart contracts, it is of paramount importance to formalize their seman- 
tics as well as the security properties of interest, in particular at the level 
of the bytecode being executed. 

In this paper, we present the first complete small-step semantics of 
EVM bytecode, which we formalize in the F* proof assistant, obtain- 
ing executable code that we successfully validate against the official 
Ethereum test suite. Furthermore, we formally define for the first time 
a number of central security properties for smart contracts, such as call 
integrity, atomicity, and independence from miner controlled parameters. 
This formalization relies on a combination of hyper- and safety properties.
In the course of this work, we identified various mistakes and imprecisions in
existing semantics and verification tools for Ethereum smart contracts, 
thereby demonstrating once more the importance of rigorous semantic 
foundations for the design of security verification techniques. 


1 Introduction 


One of the determining factors for the growing interest in blockchain technolo- 
gies is the groundbreaking promise of secure distributed computations even in 
absence of trusted third parties. Building on a distributed ledger that keeps 
track of previous transactions and the state of each account, whose functionality 
and security is ensured by a delicate combination of incentives and cryptogra- 
phy, software developers can implement sophisticated distributed, transactions- 
based computations by leveraging the scripting language offered by the underly- 
ing cryptocurrency. While many of these cryptocurrencies have an intentionally 
limited scripting language (e.g., Bitcoin [1]), Ethereum was designed from the 
© The Author(s) 2018 
ground up with a quasi Turing-complete language¹. Ethereum programs, called
smart contracts, have thus found a variety of appealing use cases, such as finan- 
cial contracts [2], auctions [3], elections [4], data management systems [5], trading 
platforms [6,7], permission management [8] and verifiable cloud computing [9], 
just to mention a few. Given their financial nature, bugs and vulnerabilities in 
smart contracts may lead to catastrophic consequences. For instance, the infa- 
mous DAO vulnerability [10] recently led to a 60M$ financial loss and similar vul- 
nerabilities occur on a regular basis [11,12]. Furthermore, many smart contracts 
in the wild are intentionally fraudulent, as highlighted in a recent survey [13]. 

A rigorous security analysis of smart contracts is thus crucial for the trust of 
the society in blockchain technologies and their widespread deployment. Unfortunately,
this task is quite challenging for various reasons. First, Ethereum smart
contracts are developed in an ad-hoc language, called Solidity, which resembles 
JavaScript but features specific transaction-oriented mechanisms and a number 
of non-standard semantic behaviours, as further described in this paper. Second, 
smart contracts are uploaded on the blockchain in the form of Ethereum Vir- 
tual Machine (EVM) bytecode, a stack-based low-level code featuring dynamic 
code creation and invocation and, in general, very little static information, which 
makes it extremely difficult to analyze. 


Related Work. Recognizing the importance of solid semantic foundations for 
smart contracts, the Ethereum foundation published a yellow paper [14] to 
describe the intended behaviour of smart contracts. This semantics, however, 
exhibits several under-specifications and does not follow any standard approach 
for the specification of program semantics, thereby hindering program verifica- 
tion. In order to provide a more precise characterization, Hirai formalizes the 
EVM semantics in the proof assistant Isabelle/HOL and uses it for manually 
proving safety properties for concrete programs [15]. This semantics, however, 
constitutes just a sound over-approximation of the original semantics [14]. More 
specifically, once a contract performs a call that is not a self-call, it is assumed 
that arbitrary code gets executed and consequently arbitrary changes to the 
account’s state and to the global state can be performed. Consequently, this 
semantics cannot serve as a general-purpose basis for static analysis techniques
that might not rely on the same over-approximation. 

In a concurrent, unpublished work, Hildebrandt et al. [16] define the EVM 
semantics in the K framework [17] — a language independent verification frame- 
work based on reachability logics. The authors leverage the power of the K frame- 
work in order to automatically derive analysis tools for the specified semantics, 
presenting as an example a gas analysis tool, a semantic debugger, and a pro- 
gram verifier based on reachability logics. The underlying semantics relies on 
non-standard local rewriting rules on the system configuration. Since parts of 
the execution are treated in separation such as the exception behavior and the 
gas calculations, one small-step consists of several rewriting steps, which makes 


1 While the language itself is Turing complete, computations are associated with a 
bounded computational budget (called gas), which gets consumed by each instruction 
thereby enforcing termination. 
this semantics harder to use as a basis for new static analysis techniques. This is
relevant whenever the static analysis tools derivable by the K framework are not 
sufficient for the desired purposes: for instance, their analysis requires the user 
to manually specify loop invariants, which is hardly doable for EVM bytecode 
and clearly does not scale to large programs. Furthermore, all these works con- 
centrate on the semantics of EVM bytecode but do not study security properties 
for smart contracts. 

Sergey and Hobor [18] compare smart contracts on the blockchain with con- 
current objects using shared memory and use this analogy to explain typical 
problems that arise when programming smart contracts in terms of concepts 
known from concurrency theory. They encourage the application of state-of-the-art
verification techniques for concurrent programs to smart contracts, but do
not describe any specific analysis method applied to smart contracts themselves. 
Mavridou and Laszka [19] define a high-level semantics for smart contracts that 
is based on finite state machines and aims at simplifying the development of 
smart contracts. They provide a translation of their state machine specification 
language to Solidity, a higher-order language for writing Ethereum smart con- 
tracts, and present design patterns that should help users to improve the security 
of their contracts. The translation to Solidity is not backed up by a correctness 
proof and the design patterns are not claimed to provide any security guarantees. 

Bhargavan et al. [20] introduce a framework to analyze Ethereum contracts 
by translation into F*, a functional programming language aimed at program 
verification and equipped with an interactive proof assistant. The translation 
supports only a fragment of the EVM bytecode and does not come with a jus- 
tifying semantic argument. 

Luu et al. have recently presented Oyente [21], a state-of-the-art static anal- 
ysis tool for EVM bytecode that relies on symbolic execution. Oyente comes 
with a semantics of a simplified fragment of the EVM bytecode and, in partic- 
ular, misses several important commands related to contract calls and contract 
creation. Furthermore, it is affected by a major bug related to calls as well as 
several other minor ones which we discovered while formalizing our semantics, 
which is inspired by theirs. Oyente supports a variety of security properties, 
such as transaction order dependency, timestamp dependency, and reentrancy, 
but the security definitions are rather syntactic and described informally. As we 
show in this paper, the lack of solid semantic foundations causes several sources 
of unsoundness in Oyente. 


Our Contributions. This work lays the semantic foundations for Ethereum 
smart contracts. Specifically, we introduce 


— The first complete small-step semantics for EVM bytecode; 

— A formalization in F* of a large fragment of our semantics, which can serve 
as a foundation for verification techniques based on encoding into this lan- 
guage [20] as well as machine-checked proofs for other analysis techniques 
(e.g., [21]). By compiling F* in OCaml, we could successfully validate our 
semantics against the official Ethereum test suite; 
— The first formal definitions of crucial security properties for smart contracts,
such as call integrity, for which we devise a dedicated proof technique,
atomicity, and independence from miner controlled parameters. Interestingly 
enough, the formalization of these properties requires hyper-properties, while 
existing static analysis techniques for smart contracts rely on reachability 
properties and syntactic conditions; 

— A collection of examples showing how the syntactic conditions employed 
in current analysis techniques are imprecise and, in several cases, unsound, 
thereby further motivating the need for solid semantic foundations and rig- 
orous security definitions for smart contracts. 


The complete semantics as well as the formalization in F* are publicly avail- 
able [22]. 


Outline. The remainder of this paper is organized as follows. Section 2 briefly 
overviews the Ethereum architecture, Sect. 3 introduces the Ethereum semantics
and our formalization in F*, Sect. 4 formally defines various security properties
for Ethereum smart contracts, and Sect. 5 concludes highlighting interesting
research directions. 


2 Background on Ethereum 


Ethereum. Ethereum is a cryptographic currency system built on top of a 
blockchain. Similar to Bitcoin, network participants publish transactions to the 
network that are then grouped into blocks by distinct nodes (the so-called miners)
and appended to the blockchain using a proof of work (PoW) consensus
mechanism. The state of the system (which we will also refer to as the global state)
consists of the state of the different accounts populating it. An account can either
be an external account (belonging to a user of the system) that carries information
on its current balance or it can be a contract account that additionally
obtains persistent storage and the contract’s code. Account balances are
given in the subunit wei of the virtual currency Ether.²

Transactions can alter the state of the system by either creating new contract 
accounts or by calling an existing account. Calls to external accounts can only 
transfer Ether to this account, but calls to contract accounts additionally execute 
the code associated to the contract. The contract execution might alter the 
storage of the account or might again perform transactions — in this case we talk 
about internal transactions. 

The execution model underlying the execution of contract code is described 
by a virtual state machine, the Ethereum Virtual Machine (EVM). This is quasi 
Turing complete as the otherwise Turing complete execution is restricted by 
the upfront defined resource gas that effectively limits the number of execu- 
tion steps. The originator of the transaction can specify the maximal gas that 
should be spent for the contract execution and also determines the gas price


2 One Ether is equivalent to 10^18 wei.
(the amount of wei to pay for a unit of gas). Upfront, the originator pays for the
gas limit according to the gas price and, in case of a successful contract execution
that did not spend the whole amount of gas dedicated to it, the originator gets
reimbursed for the gas that is left. The remaining wei paid for the used gas are
given as a fee to a beneficiary address specified by the miner. 
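
The accounting just described amounts to simple integer arithmetic. The following minimal sketch covers only the successful case discussed above; the function name and the example figures are hypothetical, and further adjustments made by the actual protocol (such as refunds tied to storage operations) are deliberately left out.

```python
def settle_gas(gas_limit: int, gas_price: int, gas_used: int):
    """Return (upfront, refund, fee) in wei for a successful execution."""
    if not 0 <= gas_used <= gas_limit:
        raise ValueError("gas_used must lie between 0 and gas_limit")
    upfront = gas_limit * gas_price              # paid by the originator in advance
    refund = (gas_limit - gas_used) * gas_price  # unused gas is reimbursed
    fee = gas_used * gas_price                   # goes to the miner's beneficiary
    return upfront, refund, fee


# Hypothetical figures: limit of 100,000 gas at 20 * 10^9 wei per unit,
# of which 21,000 units are actually consumed.
upfront, refund, fee = settle_gas(gas_limit=100_000,
                                  gas_price=20 * 10**9,
                                  gas_used=21_000)
```

By construction the refund and the fee always add up to the upfront payment, so no wei is created or lost by the settlement.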


EVM Bytecode. The code of contracts is written in EVM bytecode, an
assembler-like bytecode language. As the core of the EVM is a stack-based machine,
the set of instructions in EVM bytecode consists mainly of standard instructions
for stack operations, arithmetic, jumps and local memory access. The classical
set of instructions is enriched with an opcode for the SHA3 hash and several 
opcodes for accessing the environment that the contract was called in. In addi- 
tion, there are opcodes for accessing and modifying the storage of the account 
currently running the code and distinct opcodes for performing internal call and 
create transactions. Another instruction particular to the blockchain setting is
the SELFDESTRUCT opcode, which deletes the currently executed contract, but
only after the successful execution of the external transaction.


Gas and Exceptions. The execution of each instruction consumes a positive 
amount of gas. There is a gas limit set by the sender of the transaction. Exceed- 
ing the gas limit results in an exception that reverts the effects of the current 
transaction on the global state. In the case of nested transactions, the occur- 
rence of an exception only reverts its own effects, but not those of the calling 
transaction. Instead, the failure of an internal transaction is only indicated by 
writing zero to the caller’s stack. 
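
The following minimal sketch illustrates this behaviour of nested transactions on a toy model. All names are hypothetical, the global state is reduced to a dictionary, gas is not actually metered, and pushing 1 on success is an assumption made here to contrast with the zero written on failure.

```python
import copy


class OutOfGas(Exception):
    """Raised by a callee whose gas budget is exhausted."""


def internal_call(state: dict, caller_stack: list, callee, gas: int) -> None:
    """Run `callee` as an internal transaction on `state`."""
    snapshot = copy.deepcopy(state)  # state at the moment of the call
    try:
        callee(state, gas)
    except OutOfGas:
        # Revert only the callee's own effects ...
        state.clear()
        state.update(snapshot)
        # ... and signal the failure on the caller's stack.
        caller_stack.append(0)
    else:
        caller_stack.append(1)


def greedy_callee(state: dict, gas: int) -> None:
    state["inner"] = 42   # effect of the internal transaction
    if gas < 1_000:       # toy stand-in for exceeding the gas limit
        raise OutOfGas()


state, stack = {}, []
state["outer"] = 1        # effect of the calling transaction
internal_call(state, stack, greedy_callee, gas=10)
```

After the failed call the caller's own write is still present, while the callee's write is gone; the caller only learns about the failure if it inspects the value on its stack.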


Solidity. In practice, most Ethereum smart contracts are not written in EVM 
bytecode directly, but in the high-level language Solidity which is developed 
by the Ethereum Foundation [23]. For understanding the typical problems that 
arise when writing smart contracts, it is important to consider the design of this 
high-level language. 

Solidity is a so-called “contract-oriented” programming language that uses
the concept of class from object-oriented languages for the representation of con- 
tracts. Similar to classes in object-oriented programming, contracts specify fields 
and methods for contract instances. Fields can be seen as persistent storage of 
a contract (instance) and contract methods can by default be invoked by any 
internal or external transaction. For interacting with another contract one either 
needs to create a new instance of this contract (in which case a new contract 
account with the functionality described in the contract class is created) or one 
can directly make transactions to a known contract address holding a contract of 
the required shape. The syntax of Solidity resembles JavaScript, enriched with 
additional primitives accounting for the distributed setting of Ethereum. In par- 
ticular, Solidity provides primitives for accessing the transaction and the block 
information, like msg.sender for accessing the address of the account invoking the 
method or msg.value for accessing the amount of wei transferred by the transaction 
that invoked the method. 
Solidity shows some particularities when it comes to transferring money to
another contract, especially using the provided low-level functions send and call. A
value transfer initiated using these functions is finally translated to an internal
call transaction, which implies that calling a contract might also execute code and
in particular it can fail because the available gas is not sufficient for executing the
code. In addition, as in the EVM, these kinds of calls do not enable exception
propagation, so that the caller needs to check the return result manually.
Another special feature of Solidity is that it allows for defining so-called fallback
functions for contracts, which get executed when a call via the send function is
performed or when (using the call function) an address is called without
properly specifying the concrete function of the contract to be called.


3 Small-Step Semantics 


We introduce a small-step semantics covering the full EVM bytecode, inspired 
by the one presented by Luu et al. [21], which we substantially revise in order to 
handle the missing instructions, in particular contract calls and contract creation. In
addition, while formalizing our semantics, we found a major flaw related to calls 
and several minor ones (cf. Sect. 3.7), which we fixed and reported to the authors. 
Due to space constraints, we refer the interested reader to the full version of the 
paper [22] for a formal account of the semantic rules and present below the most 
significant ones. 


3.1 Preliminaries 


In the following, we will use $\mathbb{B}$ to denote the set $\{0,1\}$ of bits and accordingly $\mathbb{B}^x$
for the set of bitstrings of size $x$. We further let $\mathbb{N}_x$ denote the set of non-negative
integers representable by $x$ bits and allow for implicit conversion between those
two representations. In addition, we will use the notation $[X]$ (resp. $\mathcal{L}(X)$) for
arrays (resp. lists) of elements from the set $X$. We use standard notations for
operations on arrays and lists.


3.2 Global State 


As mentioned before, the global state is a (partial) mapping from account
addresses (that are bitstrings of size 160) to accounts. In the case that an account
does not exist, we assume it to map to $\bot$. Accounts, irrespective of their type,
are tuples of the form $(n, b, stor, code)$, with $n \in \mathbb{N}_{256}$ being the account’s nonce
that is incremented with every other account that the account creates, $b \in \mathbb{N}_{256}$
being the account’s balance in wei, $stor \in \mathbb{B}^{256} \to \mathbb{B}^{256}$ being the account’s
persistent storage that is represented as a mapping from 256-bit words to 256-bit
words and finally $code \in [\mathbb{B}^8]$ being the contract that is an array of bytes. In
contrast to contract accounts, external accounts have the empty bytearray as
code. As only the execution of code in the context of the account can access
and modify the account’s storage, the fact that formally external accounts have
persistent storage does not have any effect. In the following, we will denote the
set of addresses with $\mathcal{A}$ and the set of global states with $\Sigma$ and we will assume
that $\sigma \in \Sigma$.
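
As an illustration, the account tuples (n, b, stor, code) and the partial mapping from addresses to accounts could be represented as follows. This is a minimal sketch with hypothetical names: Python integers stand in for 256-bit words, `None` plays the role of the undefined account, and unset storage words are read as zero so that the dictionary behaves like a total mapping.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Account:
    nonce: int                                             # n
    balance: int                                           # b, in wei
    storage: Dict[int, int] = field(default_factory=dict)  # stor
    code: bytes = b""                                      # code

    def is_external(self) -> bool:
        """External accounts carry the empty byte array as code."""
        return self.code == b""


# The global state: a partial mapping from 160-bit addresses to accounts.
GlobalState = Dict[int, Account]


def lookup(sigma: GlobalState, address: int) -> Optional[Account]:
    """Return the account at `address`, or None if it does not exist."""
    return sigma.get(address)


def sload(account: Account, key: int) -> int:
    """Read a storage word; words never written are taken to be zero."""
    return account.storage.get(key, 0)


sigma: GlobalState = {
    0xA1: Account(nonce=0, balance=10**18),  # external account
    0xC0: Account(nonce=1, balance=0, storage={0: 7}, code=b"\x60\x00"),
}
```

Keeping the storage field on external accounts as well mirrors the uniform tuple shape used in the formalization; since such accounts have no code, the field is simply never touched.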


3.3 Small-Step Relation 


In order to define the small-step semantics, we give a small-step relation
$\Gamma \vDash S \to S'$ that specifies how a call stack $S \in \mathbb{S}$ representing the state of the
execution evolves within one step under the transaction environment $\Gamma \in \mathcal{T}_{env}$.

In Fig. 1 we give a full grammar for call stacks and transaction environments: 


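Call stacks               $\mathbb{S} \ni S := EXC :: S_{plain} \mid HALT(\sigma, d, g, \eta) :: S_{plain} \mid S_{plain}$
Plain call stacks         $\mathbb{S}_{plain} \ni S_{plain} := (\mu, \iota, \sigma, \eta) :: S_{plain}$
Machine states            $\mathcal{M} \ni \mu := (gas, pc, m, i, s)$


Execution environments    $\mathcal{I} \ni \iota := (actor, input, sender, value, code)$
Global states             $\Sigma \ni \sigma$
Account states            $\mathbb{A} \ni acc := (n, b, code, stor) \mid \bot$
Transaction effects       $\mathcal{N} \ni \eta := (b, L, S_\dagger)$
Transaction environments  $\mathcal{T}_{env} \ni \Gamma := (o, prize, H)$


Notations: $d \in [\mathbb{B}^8]$, $g \in \mathbb{N}_{256}$, $\eta \in \mathcal{N}$, $o \in \mathcal{A}$, $prize \in \mathbb{N}_{256}$, $H \in \mathcal{H}$,
$gas \in \mathbb{N}_{256}$, $pc \in \mathbb{N}_{256}$, $m \in \mathbb{B}^{256} \to \mathbb{B}^8$, $i \in \mathbb{N}_{256}$, $s \in \mathcal{L}(\mathbb{B}^{256})$,
$actor \in \mathcal{A}$, $input \in [\mathbb{B}^8]$, $sender \in \mathcal{A}$, $value \in \mathbb{N}_{256}$, $code \in [\mathbb{B}^8]$,
$b \in \mathbb{N}_{256}$, $L \in \mathcal{L}(Ev_{log})$, $S_\dagger \subseteq \mathcal{A}$, $\Sigma = \mathcal{A} \to \mathbb{A}$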


Fig. 1. Grammar for call stacks and transaction environments 


Transaction Environments. The transaction environment represents the
static information of the block that the transaction is executed in and the
immutable parameters given to the transaction, such as the gas price or the gas limit.
More specifically, the transaction environment $\Gamma \in \mathcal{T}_{env} = \mathcal{A} \times \mathbb{N}_{256} \times \mathcal{H}$ is a
tuple of the form $(o, prize, H)$ with $o \in \mathcal{A}$ being the address of the account that
made the transaction, $prize \in \mathbb{N}_{256}$ denoting the amount of wei that needs to be paid
for a unit of gas in this transaction and $H \in \mathcal{H}$ being the header of the block
that the transaction is part of. We do not specify the format of block headers
here, but just assume a set $\mathcal{H}$ of block headers.


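Callstacks. A call stack $S$ is a stack of execution states which represents the
state of the execution within one internal transaction. We give a formal definition
of the set of possible callstacks $\mathbb{S}$ as follows:
$\mathbb{S} := \{ EXC :: S_{plain},\ HALT(\sigma, gas, d, \eta) :: S_{plain},\ S_{plain}$
$\quad \mid \sigma \in \Sigma,\ gas \in \mathbb{N},\ d \in [\mathbb{B}^8],\ \eta \in \mathcal{N},\ S_{plain} \in \mathcal{L}(\mathcal{M} \times \mathcal{I} \times \Sigma \times \mathcal{N}) \}$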


Syntactically, a call stack is a stack of regular execution states of the form
$(\mu, \iota, \sigma, \eta)$ that can optionally be topped with a halting state
$HALT(\sigma, \mathit{gas}, d, \eta)$ or an exception state $EXC$. We summarize these
three types of states as execution states $\mathcal{S}$. Semantically, halting
states indicate regular halting of an internal transaction, exception states
indicate exceptional halting, and regular execution states describe the state of
internal transactions in progress. Halting and exception states can only occur as
top elements of the call stack as they represent terminated internal
transactions. Exception states of the form $EXC$ do not carry any information:
in the case of an exception all effects of the terminated internal transaction
are reverted and the caller state therefore stays unaffected, except for the
gas. Halting states instead are of the form $HALT(\sigma, \mathit{gas}, d, \eta)$,
specifying the global state $\sigma$ the execution halted in, the gas
$\mathit{gas} \in \mathbb{N}_{256}$ remaining from the execution, the return data
$d \in [\mathbb{B}^8]$, and the additional transaction effects $\eta \in \mathcal{N}$
of the internal transaction. The additional transaction effects carry information
that is accumulated during execution but does not influence the small-step
execution itself. Formally, the additional transaction effects are a triple of
the form
$(b, L, S_\dagger) \in \mathcal{N} = \mathbb{N}_{256} \times \mathcal{L}(Ev_{log}) \times \mathcal{P}(\mathcal{A})$
with $b \in \mathbb{N}_{256}$ being the refund balance that is increased by account
storage operations and will finally be paid to the transaction's beneficiary,
$L \in \mathcal{L}(Ev_{log})$ being the sequence of log events that the bytecode
execution invoked during execution, and $S_\dagger \subseteq \mathcal{A}$ being the
so-called suicide set, that is, the set of account addresses that executed the
SELFDESTRUCT command and therefore registered their account for deletion. The
information held by the halting state is carried over to the calling state.

The state of a non-terminated internal transaction is described by a regular
execution state of the form $(\mu, \iota, \sigma, \eta)$. The state is determined by
the current global state $\sigma$ of the system as well as the execution
environment $\iota \in \mathcal{I}$ that specifies the parameters of the current
transaction (including inputs and the code to be executed), the local state
$\mu \in \mathcal{M}$ of the stack machine, and the transaction effects
$\eta \in \mathcal{N}$ collected during execution so far.


Execution Environment. The execution environment $\iota$ of an internal
transaction specifies the static parameters of the transaction. It is a tuple of
the form
$(\mathit{actor}, \mathit{input}, \mathit{sender}, \mathit{value}, \mathit{code}) \in \mathcal{I} = \mathcal{A} \times [\mathbb{B}^8] \times \mathcal{A} \times \mathbb{N}_{256} \times [\mathbb{B}^8]$
with the following components:


- $\mathit{actor} \in \mathcal{A}$ is the address of the account currently executing;

- $\mathit{input} \in [\mathbb{B}^8]$ is the data given as an input to the internal transaction;

- $\mathit{sender} \in \mathcal{A}$ is the address of the account that initiated the internal
transaction;

- $\mathit{value} \in \mathbb{N}_{256}$ is the value transferred by the internal transaction;

- $\mathit{code} \in [\mathbb{B}^8]$ is the code currently executed.


This information is determined at the beginning of an internal transaction exe- 
cution and it can be accessed, but not altered during the execution. 


Machine State. The local machine state $\mu$ represents the state of the
underlying state machine used for execution and is a tuple of the form
$(\mathit{gas}, \mathit{pc}, m, i, s)$ where

- $\mathit{gas} \in \mathbb{N}_{256}$ is the current amount of gas still available for execution;

- $\mathit{pc} \in \mathbb{N}_{256}$ is the current program counter;

- $m \in \mathbb{B}^{256} \to \mathbb{B}^8$ is a mapping from 256-bit words to bytes that
represents the local memory;

- $i \in \mathbb{N}_{256}$ is the current number of active words in memory;

- $s \in \mathcal{L}(\mathbb{B}^{256})$ is the local 256-bit word stack of the stack machine.


The execution of each internal transaction starts in a fresh machine state, with 
an empty stack, memory initialized to all zeros, and program counter and active 
words in memory set to zero. Only the gas is instantiated with the gas value 
available for the execution. 
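
To make the shape of the machine state concrete, the following is a minimal Python sketch of the tuple $(\mathit{gas}, \mathit{pc}, m, i, s)$ and of the fresh state in which every internal transaction starts. The class and function names are hypothetical and not taken from the paper or its F* formalization; memory is modelled as a sparse dictionary in which absent addresses read as zero.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MachineState:
    """Local machine state (gas, pc, m, i, s) of one internal transaction."""
    gas: int                                               # gas still available
    pc: int = 0                                            # program counter
    memory: Dict[int, int] = field(default_factory=dict)   # address -> byte, absent means 0
    active_words: int = 0                                  # number of active memory words
    stack: List[int] = field(default_factory=list)         # 256-bit words, top at index 0

    def read_byte(self, address: int) -> int:
        """Read one memory byte; untouched memory reads as zero."""
        return self.memory.get(address, 0)


def fresh_machine_state(gas: int) -> MachineState:
    """Fresh state: only the gas budget is instantiated, all else is zero or empty."""
    return MachineState(gas=gas)
```

Using `default_factory` gives every state its own memory and stack, so two fresh states never share mutable data.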


3.4 Small-Step Rules 


In the following, we will present a selection of interesting small-step rules in 
order to illustrate the most important features of the semantics. 

For demonstrating the overall design of the semantics, we start with the
example of the arithmetic instruction ADD performing addition of two values on
the machine stack. Note that as the word size of the stack machine is 256, all
arithmetic operations are performed modulo $2^{256}$.


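\[
\frac{\begin{array}{c}
\iota.\mathsf{code}[\mu.\mathsf{pc}] = \mathsf{ADD} \\
\mu.\mathsf{s} = a :: b :: s \qquad \mu.\mathsf{gas} \geq 3 \qquad \mu' = \mu[\mathsf{s} \to (a + b) :: s][\mathsf{pc} \mathrel{+}= 1][\mathsf{gas} \mathrel{-}= 3]
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to (\mu', \iota, \sigma, \eta) :: S}
\]

\[
\frac{\iota.\mathsf{code}[\mu.\mathsf{pc}] = \mathsf{ADD} \qquad (|\mu.\mathsf{s}| < 2 \lor \mu.\mathsf{gas} < 3)}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to EXC :: S}
\]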


We use a dot notation in order to access components of the different state
parameters. We name the components with the variable names introduced for
these components in the last section, written in sans-serif style. In addition, we
use the usual notation for updating components: $t[\mathsf{c} \to v]$ denotes that the
component $\mathsf{c}$ of tuple $t$ is updated with value $v$. For expressing incremental
updates in a simpler way, we additionally use the notation $t[\mathsf{c} \mathrel{+}= v]$
to denote that the (numerical) component $\mathsf{c}$ of $t$ is incremented by $v$, and
similarly $t[\mathsf{c} \mathrel{-}= v]$ for decrementing a component $\mathsf{c}$ of $t$.

The execution of the arithmetic instruction ADD only performs local changes
in the machine state, affecting the local stack, the program counter, and the
gas budget. For deciding upon the correct instruction to execute, the currently
executed code (that is part of the execution environment) is accessed at the
position of the current program counter. The cost of an ADD instruction is
a constant three units of gas that get subtracted from the gas budget in the
machine state. Like every other instruction, ADD can fail due to lacking gas or
due to underflows on the machine stack. In this case, the exception state is
entered and the execution of the current internal transaction is terminated. For
better readability, we use here the slightly sloppy $\lor$ notation for combining
the two error cases in one inference rule.
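
The two ADD rules can be read as a single function on the local state. The sketch below is an illustration under simplifying assumptions, not the paper's F* code: only the stack, program counter and gas are modelled, the stack is a Python list with its top at index 0, and the function name and result tags are hypothetical.

```python
WORD_MOD = 2 ** 256   # arithmetic is performed modulo 2^256
ADD_COST = 3          # constant gas cost of ADD


def step_add(stack, pc, gas):
    """Apply one ADD step.

    Returns ("EXC",) on stack underflow or insufficient gas, otherwise
    ("OK", new_stack, new_pc, new_gas).
    """
    if len(stack) < 2 or gas < ADD_COST:
        return ("EXC",)
    a, b, *rest = stack
    return ("OK", [(a + b) % WORD_MOD] + rest, pc + 1, gas - ADD_COST)
```

For example, `step_add([1, 2, 7], 0, 10)` yields `("OK", [3, 7], 1, 7)`, while a stack with a single element leads to the exception state.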
A more interesting example of a semantic rule is the one of the CALL
instruction that initiates an internal call transaction. In the case of calling,
several corner cases need to be treated, which results in several inference rules
for this case. Here, we only present one rule for illustrating the main
functionality. More precisely, we present the case in which the account that
should be called exists, the call stack limit of 1024 is not reached yet, and the
account initiating the transaction has a sufficiently large balance for sending
the specified amount of wei to the called account.


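\[
\frac{\begin{array}{c}
\iota.\mathsf{code}[\mu.\mathsf{pc}] = \mathsf{CALL} \qquad \mu.\mathsf{s} = g :: to :: va :: io :: is :: oo :: os :: s \\
\sigma(to) \neq \bot \qquad |S| + 1 < 1024 \qquad \sigma(\iota.\mathsf{actor}).\mathsf{b} \geq va \qquad aw = M(M(\mu.\mathsf{i}, io, is), oo, os) \\
c_{call} = C_{gascap}(va, 1, g, \mu.\mathsf{gas}) \qquad c = C_{base}(va, 1) + C_{mem}(\mu.\mathsf{i}, aw) + c_{call} \\
\mu.\mathsf{gas} \geq c \qquad \sigma' = \sigma\langle to \to \sigma(to)[\mathsf{b} \mathrel{+}= va]\rangle\langle \iota.\mathsf{actor} \to \sigma(\iota.\mathsf{actor})[\mathsf{b} \mathrel{-}= va]\rangle \\
d = \mu.\mathsf{m}[io, io + is - 1] \qquad \mu' = (c_{call}, 0, \lambda x.\, 0, 0, \epsilon) \\
\iota' = \iota[\mathsf{sender} \to \iota.\mathsf{actor}][\mathsf{actor} \to to][\mathsf{value} \to va][\mathsf{input} \to d][\mathsf{code} \to \sigma(to).\mathsf{code}]
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to (\mu', \iota', \sigma', \eta) :: (\mu, \iota, \sigma, \eta) :: S}
\]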


For performing a call, the parameters to this call need to be specified on the
machine stack. These are the amount of gas $g$ that should be given as budget to
the call, the recipient $to$ of the call, and the amount $va$ of wei to be
transferred with the call. In addition, the caller needs to specify the input
data that should be given to the transaction and the place in memory where the
return data of the call should be written after successful execution. To this
end, the remaining arguments specify the offset and size of the memory fragment
that input data should be read from (determined by $io$ and $is$) and return data
should be written to (determined by $oo$ and $os$).

Calculating the cost in terms of gas for the execution is quite complicated in
the case of CALL as it is influenced by several factors including the arguments
given to the call and the current machine state. First of all, the gas that should
be given to the call (here denoted by $c_{call}$) needs to be determined. This value
is not necessarily equal to the value $g$ specified on the stack, but also depends
on the value $va$ transferred by the call and the currently available gas. In
addition, as the memory needs to be accessed for reading the input value and
writing the return value, the number of active words in memory might be
increased. This effect is captured by the memory extension function $M$. As
accessing additional words in memory costs gas, this cost needs to be taken into
account in the overall cost. The cost resulting from an increase in the number of
active words is calculated by the function $C_{mem}$. Finally, there is also a base
cost charged for the call that depends on the value $va$. As the cost also depends
on the specific case for calling that is considered, the cost calculation
functions receive a flag (here 1) as argument. These technical details are
spelled out in the full version [22].
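
The definitions of $M$ and $C_{mem}$ are not given in this section. As an illustration only, the sketch below follows the memory expansion formulas of the Ethereum Yellow Paper; that the full version [22] uses exactly these definitions is an assumption, and the Python function names are hypothetical.

```python
G_MEMORY = 3  # gas per active memory word (Yellow Paper fee schedule)


def memory_extension(active_words, offset, size):
    """M: number of active 32-byte words after accessing [offset, offset + size)."""
    if size == 0:
        return active_words              # zero-sized accesses never extend memory
    needed = -(-(offset + size) // 32)   # ceiling division by the word size
    return max(active_words, needed)


def memory_cost(words):
    """Total cost of keeping `words` memory words active (linear plus quadratic part)."""
    return G_MEMORY * words + words * words // 512


def memory_expansion_cost(active_before, active_after):
    """C_mem: gas charged for growing memory from active_before to active_after words."""
    return memory_cost(active_after) - memory_cost(active_before)
```

For CALL, the new number of active words would be obtained by applying `memory_extension` twice, first for the input fragment and then for the output fragment, mirroring $aw = M(M(\mu.\mathsf{i}, io, is), oo, os)$.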

The call itself then has several effects: First, it transfers the value $va$ from
the executing account ($\mathsf{actor}$ in the execution environment) to the
recipient ($to$). To this end, the global state is updated. Here we use a special
notation for the functional update on the global state using $\langle\,\rangle$
instead of $[\,]$. Second, for initializing the execution of the initiated
internal transaction, a new regular execution state is placed on top of the
execution stack. The internal transaction starts in a fresh machine state at
program counter zero. This means that the initial memory is initialized to all
zeros, consequently the number of active words in memory is zero as well, and
the initial stack is empty. The gas budget given to the internal transaction is
the value $c_{call}$ calculated before. The execution environment of the new call
records the call parameters. This includes the sender, which is the currently
executing account $\mathsf{actor}$, the new active account, which is now the called
account $to$, as well as the value $va$ sent and the input data given to the call.
To this end the input data is extracted from the memory using the offset $io$ and
the size $is$. We use an interval notation here to denote that a part of the
memory is extracted. Finally, the code in the execution environment of the new
internal transaction is the code of the called account.
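
The effects just described can be sketched in Python as follows. This is a simplified illustration under stated assumptions: execution states are plain dictionaries, the global state maps addresses to dictionaries with a balance `b` and a `code` field, the gas values `c_call` and `c` are passed in rather than computed, and self-calls are not modelled. The function name and data layout are hypothetical.

```python
import copy


def call_step(call_stack, to, va, input_data, c_call, c):
    """Push the callee's execution state; the caller's state is left untouched.

    call_stack is a list of execution states with the top at index 0.
    """
    caller = call_stack[0]
    sigma, iota = caller["sigma"], caller["iota"]
    actor = iota["actor"]
    # Preconditions of the presented rule; other corner cases are out of scope.
    if (to not in sigma or len(call_stack) >= 1024
            or sigma[actor]["b"] < va or caller["mu"]["gas"] < c):
        raise ValueError("corner case outside the rule sketched here")
    new_sigma = copy.deepcopy(sigma)
    new_sigma[to]["b"] += va        # credit the recipient
    new_sigma[actor]["b"] -= va     # debit the executing account
    callee = {
        # fresh machine state: budget c_call, pc 0, zeroed memory, empty stack
        "mu": {"gas": c_call, "pc": 0, "m": {}, "i": 0, "s": []},
        "iota": {"actor": to, "input": input_data, "sender": actor,
                 "value": va, "code": new_sigma[to]["code"]},
        "sigma": new_sigma,
        "eta": caller["eta"],
    }
    return [callee] + call_stack
```

Because the caller's dictionary is reused unchanged below the new state, the gas cost `c` is checked here but not yet deducted, in line with the design decision discussed next.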

Note that the execution state of the caller stays completely unaffected at this 
stage of the execution. This is a conscious design decision in order to simplify 
the expression of security properties and to make the semantics more suitable 
to abstractions. 

Besides CALL there are two different instructions for initiating internal call
transactions that implement slight variations of the simple CALL instruction.
These variations are called CALLCODE and DELEGATECALL, which both allow
for executing another account's code in the context of the caller. The difference
is that in the case of CALLCODE a new internal transaction is started and the
currently executed account is registered as the sender of this transaction, while
in the case of DELEGATECALL an existing call is really forwarded in the sense
that the sender and the value of the initiating transaction are propagated to the
new internal transaction.

Analogously to the instructions for initiating internal call transactions, there 
is also one instruction CREATE that allows for the creation of a new account. The 
semantics of this instruction is similar to the one of CALL, with the exception 
that a fresh account is created, which gets the specified transferred value, and 
that the input provided to this internal transaction, which is again specified 
in the local memory, is interpreted as the initialization code to be executed in 
order to produce the newly created account’s code as output. In contrast to the 
call transaction, a create transaction does not await a return value, but only an 
indication of success or failure. 

For discussing how to return from an internal transaction, we show the rule 
for returning from a successful internal call transaction. 


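\[
\frac{\begin{array}{c}
\iota.\mathsf{code}[\mu.\mathsf{pc}] = \mathsf{CALL} \qquad \mu.\mathsf{s} = g :: to :: va :: io :: is :: oo :: os :: s \\
\mathit{flag} = (\sigma(to) = \bot)\ ?\ 0 : 1 \qquad aw = M(M(\mu.\mathsf{i}, io, is), oo, os) \\
c_{call} = C_{gascap}(va, \mathit{flag}, g, \mu.\mathsf{gas}) \qquad c = C_{base}(va, \mathit{flag}) + C_{mem}(\mu.\mathsf{i}, aw) + c_{call} \\
\mu' = \mu[\mathsf{i} \to aw][\mathsf{s} \to 1 :: s][\mathsf{pc} \mathrel{+}= 1][\mathsf{gas} \mathrel{+}= \mathit{gas} - c][\mathsf{m} \to \mu.\mathsf{m}[[oo, oo + os - 1] \to d]]
\end{array}}
{\Gamma \vDash HALT(\sigma', \mathit{gas}, d, \eta') :: (\mu, \iota, \sigma, \eta) :: S \to (\mu', \iota, \sigma', \eta') :: S}
\]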


Leaving the caller state unchanged at the point of calling has the negative
side effect that the cost calculation needs to be redone at this point in order
to determine the new gas value of the caller state. But besides this, the rule is
straightforward: the program counter is incremented as usual and the number
of active words in memory is adjusted as memory accesses for reading the input
and return data have been made. The gas is decreased, meaning that the overall
amount of gas $c$ allocated for the execution is subtracted. However, as this cost
already includes the gas budget given to the internal transaction, the gas
$\mathit{gas}$ that is left after the execution is refunded again. In addition, the
return data $d$ is written to the local memory of the caller at the place
specified by $oo$ and $os$. Finally, the value one is written to the caller's stack
in order to indicate the success of the internal call transaction. As the
execution was successful, as indicated by the halting state, the global state and
the transaction effects of the callee are adopted by the caller.
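
The update of the caller's machine state on return can be sketched as below. This is an illustration with hypothetical names: the machine state is a dictionary, `stack_rest` is the caller's stack with the seven CALL arguments already removed, and writing at most `os_` bytes of the return data is an assumption about how a size mismatch is resolved.

```python
def resume_caller(mu, stack_rest, c, aw, oo, os_, halt_gas, d):
    """Machine state of the caller after a successful internal call.

    c        : overall gas cost of the call, including the callee's budget
    halt_gas : gas left in the callee's halting state, refunded to the caller
    d        : return data, written to memory starting at offset oo
    """
    memory = dict(mu["m"])
    for k in range(min(os_, len(d))):
        memory[oo + k] = d[k]
    return {
        "gas": mu["gas"] - c + halt_gas,
        "pc": mu["pc"] + 1,
        "m": memory,
        "i": aw,
        "s": [1] + stack_rest,   # 1 on top of the stack signals success
    }
```

With 10000 units of gas before the call, an overall cost of 5700 and 1200 units left by the callee, the caller resumes with 10000 - 5700 + 1200 = 5500 units.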

EVM bytecode offers several instructions for explicitly halting (internal) 
transaction execution. Besides the standard instructions STOP and RETURN, 
there is the SELFDESTRUCT instruction that is very particular to the blockchain 
setting. The STOP instruction causes regular halting of the internal transaction 
without returning data to the caller. In contrast, the RETURN instruction allows 
one to specify the memory fragment containing the return data that will be 
handed to the caller. 

Finally, the SELFDESTRUCT instruction halts the execution and lists the
currently executing account for later deletion. More precisely, this means that
this account will be deleted when finalizing the external transaction, but its
behavior during the ongoing small-step execution is not affected. Additionally,
the whole balance of the deleted account is transferred to some beneficiary
specified on the machine stack.

We show the small-step rule depicting the main functionality of
SELFDESTRUCT. As for CALL, capturing the whole functionality of
SELFDESTRUCT would require considering several corner cases. Here we
consider the case where the beneficiary exists, the stack does not underflow, and
the available amount of gas is sufficient.


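\[
\frac{\begin{array}{c}
\iota.\mathsf{code}[\mu.\mathsf{pc}] = \mathsf{SELFDESTRUCT} \qquad \mu.\mathsf{s} = a_{ben} :: s \\
a = a_{ben} \bmod 2^{160} \qquad \sigma(a) \neq \bot \qquad \mu.\mathsf{gas} \geq 5000 \qquad g = \mu.\mathsf{gas} - 5000 \\
\sigma' = \sigma\langle \iota.\mathsf{actor} \to \sigma(\iota.\mathsf{actor})[\mathsf{balance} \to 0]\rangle\langle a \to \sigma(a)[\mathsf{balance} \mathrel{+}= \sigma(\iota.\mathsf{actor}).\mathsf{balance}]\rangle \\
r = (\iota.\mathsf{actor} \in \eta.\mathsf{S}_\dagger)\ ?\ 0 : 24000 \qquad \eta' = \eta[\mathsf{S}_\dagger \to \eta.\mathsf{S}_\dagger \cup \{\iota.\mathsf{actor}\}][\mathsf{balance} \mathrel{+}= r]
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to HALT(\sigma', g, \epsilon, \eta') :: S}
\]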


The SELFDESTRUCT command takes one argument $a_{ben}$ from the stack
specifying the address of the beneficiary that should get the balance of the
account that is destructed. If all preconditions are satisfied, the balance of
the executing account ($\iota.\mathsf{actor}$) is transferred to the beneficiary
address and the current internal transaction execution enters a halting state.
Additionally, the transaction effects are extended by adding $\iota.\mathsf{actor}$ to
the suicide set and by possibly increasing the refund balance. The refund balance
is only increased if $\iota.\mathsf{actor}$ is not already scheduled for deletion. The
halting state captures the global state $\sigma'$ after the money transfer, the
remaining gas $g$ after executing the SELFDESTRUCT, and the updated transaction
effects $\eta'$. As no return data is handed to the caller, the empty bytearray
$\epsilon$ is specified as return data in the halting state.

Note that SELFDESTRUCT deletes the currently executing account $\iota.\mathsf{actor}$,
which is not necessarily the same account as the one owning the code
$\iota.\mathsf{code}$. This might be due to a previous execution of DELEGATECALL or
CALLCODE.
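
The presented SELFDESTRUCT case can be sketched in Python as follows. The sketch makes simplifying assumptions: the global state is reduced to a dictionary from addresses to balances, the transaction effects are reduced to the refund balance and the suicide set (log events are omitted), and the case where the beneficiary is the executing account itself is not modelled. All names are hypothetical.

```python
ADDRESS_MOD = 2 ** 160        # addresses are 160-bit values
SELFDESTRUCT_COST = 5000      # gas cost used in the presented rule
SELFDESTRUCT_REFUND = 24000   # refund for a first-time registration


def selfdestruct_step(gas, a_ben, actor, balances, suicide_set, refund):
    """Return the halting state ("HALT", sigma', g, return_data, eta')."""
    a = a_ben % ADDRESS_MOD
    if a not in balances or gas < SELFDESTRUCT_COST:
        raise ValueError("corner case outside the rule sketched here")
    new_balances = dict(balances)
    new_balances[actor] = 0
    new_balances[a] = balances[a] + balances[actor]   # whole balance goes to a
    # The refund is only granted if actor is not yet scheduled for deletion.
    r = 0 if actor in suicide_set else SELFDESTRUCT_REFUND
    eta = (refund + r, suicide_set | {actor})
    return ("HALT", new_balances, gas - SELFDESTRUCT_COST, b"", eta)
```

Calling the function a second time with the returned suicide set leaves the refund balance unchanged, which mirrors the condition on $r$ in the rule.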


3.5 Transaction Execution 


The outcome of an external transaction execution does not only consist of the
result of the EVM bytecode execution. Before executing the bytecode, the
transaction environment and the execution environment are determined from the
transaction information and the block header. In the following we assume
$\mathcal{T}$ to denote the set of transactions. An (external) transaction
$T \in \mathcal{T}$, similar to the internal transactions, specifies a gas limit, a
recipient and a value to be transferred. In addition, it also contains the
originator and the gas price that will be recorded in the transaction
environment. Finally, it specifies an input to the transaction and the
transaction type, which can either be a call or a create transaction. The
transaction type determines whether the input will be interpreted as input data
to a call transaction or as initialization code for a create transaction. In
addition to the initialization of the environments, some initial changes on the
global state and validity checks are performed. For the sake of presentation we
assume in the following a function
$\mathit{initialize}(\cdot, \cdot, \cdot) \in \mathcal{T} \times \mathcal{H} \times \Sigma \to (\mathcal{T}_{env} \times \mathcal{S}) \cup \{\bot\}$
performing the initialization phase and returning a transaction environment and
initial execution state in the case of a valid transaction and $\bot$ otherwise.
Similarly, we assume a function
$\mathit{finalize}(\cdot, \cdot, \cdot) \in \mathcal{T} \times \mathcal{S} \times \mathcal{N} \times \Sigma$
that, given the final global state of the execution, the accumulated transaction
effects and the transaction, computes the final effects on the global state.
These include, for example, the deletion of the contracts from the suicide set
and the payout to the beneficiary of the transaction.

Formally we can define the execution of a transaction T € T in a block with 
header H € H as follows: 


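\[
\begin{array}{c}
(\Gamma, s) = \mathit{initialize}(T, H, \sigma) \\
\Gamma \vDash s :: \epsilon \to^{*} s' :: \epsilon \qquad \mathit{final}(s') \qquad \sigma' = \mathit{finalize}(s', \eta, T)
\end{array}
\]


where $\to^{*}$ denotes the reflexive and transitive closure of the small-step
relation and the predicate $\mathit{final}(\cdot)$ characterizes a state that cannot
be further reduced using the small-step relation.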


3.6 Formalization in F* 


We provide a formalization of a large fragment of our small-step semantics in the
proof assistant F* [24]. At the time of writing, we are formalizing the remaining
part, which only consists of straightforward local operations, such as bitwise
operators and opcodes to write code to (resp. read code from) the memory.
F* is an ML dialect that is optimized for program verification and allows for
performing manual proofs as well as automated proofs leveraging the power of
SMT solvers.

Our formalization strictly follows the small-step semantics as presented in
this paper. The core functionality is implemented by the function step that
describes how an execution stack evolves within one execution step. To this end
it has two possible outcomes: either it performs an execution step and returns
the new call stack or, in the case that a final configuration is reached (which
is a stack containing only one element that is either a halting or an exception
state), it reports the final state. In order to provide a total function for the
step relation, we needed to introduce a third execution outcome that signals that
a problem occurred due to an inconsistent state. When running the semantics
from a valid initial configuration this result, however, should never be produced.
For running the semantics, the function execution is defined that repeatedly
performs execution steps using step until reaching the final state and reports it.
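
The interplay of step and execution can be illustrated with a small Python driver. This is a sketch of the three-outcome design only, not a transcription of the F* code: the outcome tags, the toy state encoding and the `max_steps` bound are assumptions introduced for the example.

```python
def toy_step(call_stack):
    """Toy step function with the three outcomes described above.

    States are ("RUN", n), ("HALT",) or ("EXC",); the stack top is at index 0.
    """
    if not call_stack:
        return ("error", "empty call stack")        # inconsistent state
    top, rest = call_stack[0], call_stack[1:]
    if top[0] in ("HALT", "EXC"):
        if not rest:
            return ("final", top)                   # final configuration reached
        return ("next", rest)                       # simplified return to the caller
    if top[0] == "RUN":
        n = top[1]
        new_top = ("RUN", n - 1) if n > 0 else ("HALT",)
        return ("next", [new_top] + rest)
    return ("error", "unknown state")


def execute(step, call_stack, max_steps=10_000):
    """Apply step repeatedly until a final state is reported."""
    for _ in range(max_steps):
        kind, payload = step(call_stack)
        if kind == "final":
            return payload
        if kind == "error":
            raise RuntimeError(payload)
        call_stack = payload
    raise RuntimeError("step bound exceeded")
```

Making the inconsistent case an explicit outcome keeps `toy_step` total, which is the same reason given above for the third outcome of step.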

The current implementation encompasses approximately one thousand lines of
code. Since F* code can be compiled into OCaml, we validate our semantics
against the official EVM test suite [25]. Our semantics passes 304 out of 624
tests, failing only in those involving any of the missing functionalities.

We make the formalization in F* publicly available [22] in order to facili- 
tate the design of static analysis techniques for EVM bytecode as well as their 
soundness proofs. 


3.7 Comparison with the Semantics by Luu et al. [21] 


The small-step semantics defined by Luu et al. [21] encompasses only a variation
of a subset of EVM bytecode instructions (called EtherLite) and assumes a
heavily simplified execution configuration. The instructions covered span simple
stack operations for pushing and popping values, conditional branches, binary
operations, instructions for accessing and altering local memory and account
storage, as well as the ones for calling, returning and destructing the account.
Essential instructions such as CREATE and those for accessing the transaction and
block information are omitted. The authors represent a configuration as a tuple
of a call stack of activation records and the global state. An activation record
contains the code to be executed, the program counter, the local memory and
the machine stack. The global state is modelled as a mapping from addresses to
accounts, with the latter consisting of code, balance and persistent storage.
The overall abstraction contains a conceptual flaw: since the global state is not
included in the activation records of the call stack, it is not possible to
model that, in the case of an exception in the execution of the callee, the
global state is rolled back to the one of the caller at the point of calling. In
addition, the model cannot be easily extended with further instructions, such as
further call instructions or instructions accessing the environment, without
major changes in the abstraction, as a lot of information, e.g., the information
captured in our small-step semantics in the transaction and the execution
environment, is missing.


4 Security Definitions


In the following, we introduce the semantic characterization of the most sig- 
nificant security properties for smart contracts, motivating them with typical 
vulnerabilities recurring in the wild. 

For selecting those properties, we inspected the classification of bugs
performed in [13,21]. To our knowledge, these are the only works published so far
that aim at systematically summarizing bugs in Ethereum smart contracts.

For the presented bugs, we synthesized the semantic security properties that
were violated. In this process we realized that some bugs share the same
underlying property violation and that other bugs cannot be captured by such
generic properties, either because they are of a purely syntactic nature or
because they constitute a deviation from a desired behavior that is particular to
a specific contract.


Preliminary Notations. Formally, we represent a contract as a tuple of the
form $(a, \mathit{code})$ where $a \in \mathcal{A}$ denotes the address of the contract
and $\mathit{code} \in [\mathbb{B}^8]$ denotes the contract's code. We denote the set of
contracts by $\mathcal{C}$ and assume functions $\mathit{address}(\cdot)$ and
$\mathit{code}(\cdot)$ that extract the contract address and code respectively.

As we will argue about contracts being called in an arbitrary setting, we
additionally introduce the notion of reachable configuration. Intuitively, a pair
$(\Gamma, S)$ of a transaction environment $\Gamma$ and a call stack $S$ is reachable
if there exists a state $s$ such that $\Gamma, s$ are the result of
$\mathit{initialize}(T, H, \sigma)$, for some transaction $T$, block header $H$, and
global state $\sigma$, and $S$ is reachable from $s$.


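Definition 1 (Reachable Configuration). The pair
$(\Gamma, S) \in \mathcal{T}_{env} \times \mathbb{S}$ is a reachable configuration if
for some transaction $T \in \mathcal{T}$, some block header $H \in \mathcal{H}$ and
some global state $\sigma \in \mathcal{A} \to \mathbb{A}$ of the blockchain it holds
that


\[
(\Gamma, s) = \mathit{initialize}(T, H, \sigma) \land \Gamma \vDash s :: \epsilon \to^{*} S
\]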


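In order to give concise security definitions, we further introduce, and assume
throughout the paper, an annotation to the small-step semantics in order to
highlight the contract $c$ that is currently executed. In the case of
initialization code being executed, we use $\bot$. Specifically, we let


\[
\begin{array}{rl}
\mathbb{S}_n := & \{ EXC_c :: S_{plain},\ HALT(\sigma, \mathit{gas}, d, \eta)_c :: S_{plain},\ S_{plain} \\
 & \mid \sigma \in \Sigma,\ \mathit{gas} \in \mathbb{N},\ d \in [\mathbb{B}^8],\ \eta \in \mathcal{N},\ S_{plain} \in \mathcal{L}((\mathcal{M} \times \mathcal{I} \times \Sigma \times \mathcal{N}) \times \mathcal{C}_\bot) \}
\end{array}
\]


where $c \in \mathcal{C}_\bot = \mathcal{C} \cup \{\bot\}$.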

Next, we introduce the notion of execution trace for smart contract execution.
Intuitively, a trace is a sequence of actions. In our setting, the actions to be
recorded are composed of an opcode, the address of the executing contract,
and a sequence of arguments to the opcode. We denote the set of actions with
$\mathit{Act}$. Accordingly, every small step produces a trace consisting of a single
action. Again, we lift the resulting trace semantics to multiple execution steps
that then produce sequences of actions $\pi \in \mathcal{L}(\mathit{Act})$. We only
report the trace semantics definition for the CALL case here, referring to the
full version of the paper for the details [22].


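\[
\frac{\iota.\mathsf{code}[\mu.\mathsf{pc}] = \mathsf{CALL} \qquad \mu.\mathsf{s} = g :: to :: va :: io :: is :: oo :: os :: s \qquad \dots}
{\Gamma \vDash (\mu, \iota, \sigma, \eta)_c :: S \xrightarrow{\mathsf{CALL}_c(g, to, va, io, is, oo, os)} (\mu', \iota', \sigma', \eta)_{c'} :: (\mu, \iota, \sigma, \eta)_c :: S}
\]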


We will write $\pi\!\downarrow_{calls_c}$ to denote the projection of $\pi$ to calls
performed by contract $c$, i.e., actions of the form
$\mathsf{CALL}_c(g, to, va, io, is, oo, os)$, $\mathsf{CREATE}_c(va, io, is)$,
$\mathsf{CALLCODE}_c(g, to, va, io, is, oo, os)$, and
$\mathsf{DELEGATECALL}_c(g, to, io, is, oo, os)$.
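
The projection can be sketched as a simple filter over a recorded trace. In this illustration an action is encoded as a tuple of opcode name, executing contract and argument tuple; this encoding and the function name are assumptions made for the example.

```python
CALL_OPCODES = {"CALL", "CREATE", "CALLCODE", "DELEGATECALL"}


def project_calls(trace, contract):
    """Keep only the call actions performed by `contract`, preserving their order."""
    return [action for action in trace
            if action[0] in CALL_OPCODES and action[1] == contract]
```

Two runs then agree on the calls of a contract exactly when their projections are equal as lists, which is the comparison used in the security definitions that follow.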


4.1 Call Integrity 


Dependency on Attacker Code. One of the most famous bugs of Ethereum's
history is the so-called DAO bug that led to a loss of 60 million dollars in June
2016 [10]. This bug is classified in the literature as a reentrancy bug [13,21],
as the affected contract was drained of money by repeatedly reentering it and
performing transactions to the attacker on behalf of the contract. More
generally, the problem of this contract was that malicious code was able to affect
the outgoing money flows of the contract. The cause of such bugs is mostly rooted
in the developer's misunderstanding of the semantics of Solidity's call
primitives. In general, calling a contract can invoke two kinds of actions:
transferring Ether to the contract's account or executing (parts of) a contract's
code. In particular, the call construct invokes the called contract's fallback
function when no particular function of the contract is specified (2).
Consequently, the developer may expect an atomic value transfer where potentially
another contract's code is executed. For illustrating how to exploit this sort of
bug, we consider the following contracts:


    contract Bob{
      bool sent = false;
      function ping(address c){
        if (!sent) { c.call.value(2)();
                     sent = true; }}}

    contract Mallory{
      function(){
        Bob(msg.sender).ping(this);}}


The function ping of contract Bob sends an amount of 2 wei to the address
specified in the argument. However, this should only be possible once, which
is supposedly ensured by the sent variable that is set after the successful money
transfer. Instead, it turns out that invoking the call.value function on a
contract's address invokes the contract's fallback function as well.

Given a second contract Mallory, it is possible to transfer more money than
the intended 2 wei to the account of Mallory. By invoking Bob's function ping with
the address of Mallory's account, 2 wei are transferred to Mallory's account and
additionally the fallback function of Mallory is invoked. As the fallback function
again calls the ping function with Mallory's address, another 2 wei are
transferred before the variable sent of contract Bob is set. This looping goes on
until all gas of the initial call is consumed or the call stack limit is reached.
In this case, only the last transfer of wei is reverted and the effects of all
former calls stay in place. Consequently, the intended restriction on contract
Bob's ping function (namely to only transfer 2 wei once) is circumvented.
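
The attack can be replayed in a small Python model of the two contracts. This is an illustrative simulation, not EVM code: gas is not modelled, a hypothetical `max_depth` bound stands in for both the gas budget and the call stack limit, and a failing transfer is modelled by simply skipping it.

```python
class Bob:
    def __init__(self, balance, max_depth=64):
        self.balance = balance
        self.sent = False
        self.max_depth = max_depth   # stand-in for gas and call stack limits
        self.depth = 0

    def ping(self, recipient):
        if not self.sent:
            if self.balance >= 2 and self.depth < self.max_depth:
                self.balance -= 2
                self.depth += 1
                recipient.fallback(self, 2)   # control passes to the recipient
                self.depth -= 1
            self.sent = True                  # set only after the call returns


class Mallory:
    def __init__(self):
        self.balance = 0

    def fallback(self, sender, amount):
        self.balance += amount
        sender.ping(self)                     # reenter Bob before sent is set


class Honest:
    def __init__(self):
        self.balance = 0

    def fallback(self, sender, amount):
        self.balance += amount                # no reentrancy
```

With `Bob(10)`, an `Honest` recipient receives the intended 2 wei, whereas `Mallory` drains all 10 wei because every reentrant `ping` still observes `sent == False`.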


Call Integrity. In order to protect from this class of bugs, it is crucial to
secure the code against being reentered before regaining control over the control
flow. From a security perspective, the fundamental problem is that the contract
behaviour depends on untrusted code, even though this was not intended by
the developer. We capture this intuition through a hyperproperty, which we
name call integrity. The idea is that no matter how the attacker can schedule
$c$ (call stacks $S$ and $S'$ in the definition), the calls of $c$ (traces $\pi$,
$\pi'$) cannot be controlled by the attacker, even if $c$ hands over the control to
the attacker.


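Definition 2 (Call Integrity). A contract $c \in \mathcal{C}$ satisfies call
integrity for a set of addresses $\mathcal{A}_C \subseteq \mathcal{A}$ if for all
reachable configurations $(\Gamma, s_c :: S)$, $(\Gamma, s'_c :: S')$ with $s, s'$
differing only in the code with address in $\mathcal{A}_C$, it holds that for all
$t, t'$


\[
\begin{array}{l}
\Gamma \vDash s_c :: S \xrightarrow{\pi}{}^{*}\ t_c :: S \land \mathit{final}(t_c) \land \Gamma \vDash s'_c :: S' \xrightarrow{\pi'}{}^{*}\ t'_c :: S' \land \mathit{final}(t'_c) \\
\quad \implies \pi\!\downarrow_{calls_c} = \pi'\!\downarrow_{calls_c}
\end{array}
\]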


4.2 Proof Technique for Call Integrity 


We now establish a proof technique for call integrity, based on local properties 
that are arguably easier to verify and that we show to imply call integrity. As 
a first observation, we identify the different ways in which external contracts 
can influence the execution of a smart contract c and introduce corresponding 
security properties: 


Code Dependency. The contract $c$ might access (information on) the
untrusted contract's code via the EXTCODECOPY or the EXTCODESIZE
instructions and make its behaviour depend on those values;

Effect Dependency. The contract $c$ might call the untrusted contract and
might depend on its execution effects and return value;

Re-entrancy. The contract $c$ might call the untrusted contract, with the
latter influencing the behaviour of the former by performing changes to the
global state itself or "on behalf" of $c$ by reentering it and thereby potentially
decreasing the balance of $c$.


The first two of these properties can be seen as value dependencies and
therefore can be formalized as hyperproperties. The first property says that the
calls performed by a contract should not be affected by the effects on the
execution state produced by adversarial contracts. Technically, we consider a
contract $c$ calling an adversarial contract $c'$ (captured as
$\Gamma \vDash s_c :: S \to s''_{c'} :: s_c :: S$ in the premise), which we let
terminate in two arbitrary states $s'$, $t'$: we require that $c$'s continuation
code performs the same calls in both states.

Definition 3 ($\mathcal{A}_C$-effect Independence). A contract $c \in \mathcal{C}$ is
$\mathcal{A}_C$-effect independent for a set of addresses
$\mathcal{A}_C \subseteq \mathcal{A}$ if for all reachable configurations
$(\Gamma, s_c :: S)$ such that $\Gamma \vDash s_c :: S \to s''_{c'} :: s_c :: S$ for some
$s''$ and $\mathit{address}(c') \in \mathcal{A}_C$, it holds that for all final states
$s', t'$ whose global state might differ in all components but the code from the
global state of $s$,


\[
\begin{array}{l}
\Gamma_{init} \vDash s'_{c'} :: s_c :: S \xrightarrow{\pi}{}^{*}\ s''_c :: S \land \mathit{final}(s'') \\
\quad \land\ \Gamma_{init} \vDash t'_{c'} :: s_c :: S \xrightarrow{\pi'}{}^{*}\ t''_c :: S \land \mathit{final}(t'') \\
\quad \implies \pi\!\downarrow_{calls_c} = \pi'\!\downarrow_{calls_c}
\end{array}
\]


The second property says that the calls of a contract should not be affected
by the code read from the blockchain (e.g., the code does not branch on code read
from the blockchain). To this end we introduce the notation
$\Gamma \vDash s :: S \xrightarrow{\pi}_{f}{}^{*} s' :: S$
to denote that the local small-step execution of state $s$ on stack $S$ under
$\Gamma$ results in several steps in state $s'$ producing trace $\pi$ given that
in the local execution steps of EXTCODECOPY and EXTCODESIZE, which are the
operations used to access the code on the global state, the code returned by
these functions is determined by the partial function
$f \in \mathcal{A} \to [\mathbb{B}]$ as opposed to the global state. In other
words, we consider in the premise a contract c reading two different codes from
the blockchain and terminating in both runs (captured as
$\Gamma \vDash s_c :: S \xrightarrow{\pi}_{f}{}^{*} s'_c :: S$ and
$\Gamma \vDash s_c :: S \xrightarrow{\pi'}_{f'}{}^{*} s''_c :: S$), and we
require that c performs the same calls in both runs.
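Definition 4 ($A_C$-code Independence). A contract $c \in \mathcal{C}$ is
$A_C$-code independent for a set of addresses $A_C \subseteq \mathcal{A}$ if for
all reachable configurations $(\Gamma, s_c :: S)$ it holds for all local code
updates $f, f' \in \mathcal{A} \to [\mathbb{B}]$ on $A_C$ that

$$
\begin{aligned}
& \Gamma \vDash s_c :: S \xrightarrow{\pi}_{f}{}^{*} s'_c :: S \;\wedge\; \mathit{final}(s')
\;\wedge\; \Gamma \vDash s_c :: S \xrightarrow{\pi'}_{f'}{}^{*} s''_c :: S \;\wedge\; \mathit{final}(s'') \\
\implies\; & \pi \downarrow_{\mathit{calls}_c} = \pi' \downarrow_{\mathit{calls}_c}.
\end{aligned}
$$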


Both these independence properties can be overapproximated by static anal- 
ysis techniques based on program dependence graphs [26], as done by Joana to 
verify non-interference in Java [27]. The idea is to traverse the dependence graph 
in order to detect dependencies between the sensitive sources, in our case the 
data controlled by the adversary and returned to the contract, and the observable 
sinks, in our case the local contract calls. 
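
The traversal idea can be illustrated with a minimal Python sketch. It is not the analysis implemented by Joana or by the authors: the graph, the node names, and the function `tainted_sinks` are hypothetical, and the graph is assumed to already contain both data and control dependence edges.

```python
from collections import deque


def tainted_sinks(graph, sources, sinks):
    """Return the sinks reachable from any source in a dependence graph.

    graph maps a node to the nodes that depend on it (data or control).
    """
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen & set(sinks)


# Hypothetical dependence graph of a small contract (names are illustrative).
example_graph = {
    "ret_of_untrusted_call": ["tmp"],   # data dependence
    "tmp": ["branch_cond"],             # data dependence
    "branch_cond": ["call_2"],          # control dependence
    "storage_x": ["call_1"],            # unrelated to the adversary
}
flagged = tainted_sinks(
    example_graph,
    sources=["ret_of_untrusted_call"],
    sinks=["call_1", "call_2"],
)
```

Because reachability over-approximates actual information flow, an empty result would indicate independence under this model, while a non-empty result only indicates a possible dependency.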

The last property constitutes a safety property. Specifically, single-entrancy 
states that it cannot happen that when reentering the contract c another call 
is performed before returning (i.e., after reentrancy, which we capture in the 
call stack as two distinct states with the same running contract c, the call stack 
cannot further increase). 
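Definition 5 (Single-entrancy). A contract $c \in \mathcal{C}$ is single-entrant
if for all reachable configurations $(\Gamma, s_c :: S)$, it holds for all
$s', s'', S'$ that

$$
\begin{aligned}
& \Gamma \vDash s_c :: S \rightarrow^{*} s'_c :: S' \mathbin{+\!\!+} s_c :: S \\
\implies\; & \neg \exists s'' \in \mathcal{S},\, c' \in \mathcal{C}_{\bot}.\;
\Gamma \vDash s'_c :: S' \mathbin{+\!\!+} s_c :: S \rightarrow^{*} s''_{c'} :: s'_c :: S' \mathbin{+\!\!+} s_c :: S
\end{aligned}
$$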


This safety property can be easily overapproximated by syntactic conditions, as 
for instance done in the Oyente analyzer [21]. 
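
As an illustration of the condition in Definition 5, the following Python sketch checks a single recorded run. It is a simplification under stated assumptions: frames are reduced to contract names, the trace format and the function `is_single_entrant` are hypothetical, and checking one run can only refute single-entrancy, not establish it for all reachable configurations.

```python
def is_single_entrant(trace, c):
    """Check a recorded run for single-entrancy violations of contract c.

    trace is a list of call stacks; each stack lists contract names from the
    bottom frame to the top frame.
    """
    for stack in trace:
        if len(stack) < 3:
            continue
        caller, below = stack[-2], stack[:-2]
        # c is the caller of the top frame and is also active further down:
        # c was reentered and then performed another call.
        if caller == c and c in below:
            return False
    return True


# c calls adv, adv reenters c, and the reentered c returns without calling.
ok_trace = [["c"], ["c", "adv"], ["c", "adv", "c"], ["c", "adv"], ["c"]]
# Same prefix, but the reentered c performs a further call.
bad_trace = [["c"], ["c", "adv"], ["c", "adv", "c"], ["c", "adv", "c", "adv"]]
```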

Finally, the next theorem proves the soundness of our proof technique, i.e., 
the two independence properties and the single-entrancy property together entail 
call integrity. 


Theorem 1. Let $c \in \mathcal{C}$ be a contract and $A_C \subseteq \mathcal{A}$
be a set of untrusted addresses. If c is $A_C$-code independent, c is
$A_C$-effect independent, and c is single-entrant then c provides call integrity
for $A_C$.


Proof Sketch. Let $(\Gamma, s_c :: S)$, $(\Gamma, s'_c :: S')$ be reachable
configurations such that $s, s'$ differ only in the code with address in $A_C$.
We now compare the two small-step runs of those configurations. Due to
$A_C$-code independence, the execution until the first call to an address
$a \in A_C$ produces the same partial trace until the call to $a$. Indeed, we
can express the runs under different address mappings through the code update
from the $A_C$-code independence property, as long as no call to one of the
updated addresses is performed. When a first call to $a \in A_C$ is performed,
we know due to single-entrancy that the following call cannot produce any
partial execution trace for any of the runs as this would imply that contract c
is reentered and a call out of the contract is performed. Due to $A_C$-code
independence and $A_C$-effect independence, the traces after returning must
coincide till the next call to an address in $A_C$. This argument can be
iteratively applied until reaching the final state of the execution of c.


4.3 Atomicity 


Exception Handling. As discussed in Sect. 2, the way exceptions are propagated
varies with the way contracts are called. In particular, in the case of
call and send, exceptions are not propagated, but a manual check for the suc- 
cessful completion of the called function’s execution is required. This behavior 
reflects the way exceptions are reported during bytecode execution: Instead of 
propagating up through the call stack, the callee reports the exception to the 
caller by writing zero to the stack. In the context of Ethereum, the issue of 
exception handling is particularly delicate as due to the gas restriction, it might 
always happen that a call fails simply because it ran out of gas. Intuitively, a 
user would expect a contract not to depend on the concrete gas value that is 
given to it, with the exception that a contract might always fail completely (and 
consequently does not perform any changes on the global state). Such a behavior 
would prevent contracts from entering an inconsistent state as the one presented 
in the following excerpt of a simple banking contract: 
    contract SimpleBank {
        mapping(address => uint) balances;

        function withdraw() {
            // The result of send is ignored.
            msg.sender.send(balances[msg.sender]);
            balances[msg.sender] = 0;
        }
    }


The contract keeps a record of the user balances and provides a function 
that allows a user to withdraw its own balance — which results in an update 
of the record. A developer might not expect that the send might fail, but as it
is represented on the bytecode level by a CALL instruction, code might be
executed in addition to the Ether transfer, and this code might run out of gas.
As a consequence,
the contract would end up in a state where the money was not transferred (as 
all effects of the call are reverted in case of an exception), but still the internal 
balance record of the contract was updated and consequently the money cannot 
be withdrawn by the owner anymore. 
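
The inconsistent state can be reproduced in a small Python model. This is a sketch only: the class `SimpleBankModel`, the flag `send_succeeds`, and the method `withdraw_checked` are hypothetical names, gas is abstracted into a boolean outcome of the send, and the checked variant is one possible repair rather than a fix proposed in the text above.

```python
class SimpleBankModel:
    """Toy model of the SimpleBank excerpt; not EVM-accurate."""

    def __init__(self, balances):
        self.balances = dict(balances)        # internal balance record
        self.ether = sum(balances.values())   # Ether held by the contract

    def _send(self, amount, succeeds):
        # Like send: on failure the transfer is reverted and False is returned.
        if succeeds:
            self.ether -= amount
        return succeeds

    def withdraw(self, user, send_succeeds):
        # Mirrors the excerpt: the outcome of the send is ignored.
        self._send(self.balances[user], send_succeeds)
        self.balances[user] = 0

    def withdraw_checked(self, user, send_succeeds):
        # Assumed repair: only reset the record when the send succeeded.
        if self._send(self.balances[user], send_succeeds):
            self.balances[user] = 0


bank = SimpleBankModel({"alice": 10})
bank.withdraw("alice", send_succeeds=False)
# The contract still holds the Ether, but the record claims alice owns nothing.
```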

Inspired by such situations where an inconsistent state is entered by a con- 
tract due to mishandled gas exceptions, we introduce the notion of atomicity 
of a contract. Intuitively, atomicity requires that the effects of the execution on 
the global state do not depend on the amount of gas available — except when an 
exception is triggered, in which case the overall execution should have no effect 
at all. The last condition is captured by requiring that the final global state is 
the same as the initial one for at least one of the two executions (intuitively, the 
one causing the exception). 


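Definition 6. A contract $c \in \mathcal{C}$ satisfies atomicity if for all
reachable configurations $(\Gamma, S')$ such that
$\Gamma \vDash S' \rightarrow^{*} s_c :: S$, it holds for all gas values
$g, g' \in \mathbb{N}_{256}$ that

$$
\begin{aligned}
& \Gamma \vDash s_c[\mu.\mathit{gas} \to g] :: S \rightarrow^{*} s'_c :: S \;\wedge\; \mathit{final}(s') \\
\wedge\; & \Gamma \vDash s_c[\mu.\mathit{gas} \to g'] :: S \rightarrow^{*} s''_c :: S \;\wedge\; \mathit{final}(s'') \\
\implies\; & s'.\sigma = s''.\sigma \;\vee\; s.\sigma = s'.\sigma \;\vee\; s.\sigma = s''.\sigma
\end{aligned}
$$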
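
The shape of the atomicity condition can be exercised on a toy model in Python. Everything here is an assumption made for illustration: the function names, the encoding of the global state as a pair, and the gas threshold of 5 are hypothetical. Since only sampled gas values are compared, the check can expose a violation but cannot prove atomicity.

```python
from itertools import combinations


def satisfies_atomicity(run, initial_state, gas_values):
    """Check the atomicity condition on a toy model for the given gas values.

    run(state, gas) returns the final global state of a terminating run.
    """
    finals = [run(initial_state, g) for g in gas_values]
    for s1, s2 in combinations(finals, 2):
        # Either both runs agree, or one of them left the state unchanged.
        if not (s1 == s2 or s1 == initial_state or s2 == initial_state):
            return False
    return True


def run_unchecked(state, gas):
    # state = (recorded balance, Ether held); the send needs 5 gas (assumption).
    recorded, ether = state
    if gas >= 5:
        ether -= recorded
    return (0, ether)          # the record is reset regardless of the outcome


def run_checked(state, gas):
    recorded, ether = state
    if gas >= 5:
        return (0, ether - recorded)
    return state               # failed send: no effect on the state at all
```

Under this model, `run_unchecked` reaches two different final states that both differ from the initial one, which is exactly the situation the definition rules out.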


4.4 Independence of Miner Controlled Parameters 


Another particularity of the distributed blockchain environment is that users,
while performing transactions, cannot make assumptions on large parts of the
context their transaction will be executed in. A part of this is due to the
asynchronous nature of the system: it can always be that another transaction
that alters the context was performed first. Actually, the situation is even
more delicate as transactions are not processed in a first-come, first-served
manner, but miners have a big influence on the execution context of
transactions. They can decide upon the order of the transactions in a block (and
also sneak their own transactions in first) and in addition they can even
control some parameters, such as the block timestamp, within a certain range.
Consequently, contracts whose (outgoing) money flows depend either on
miner-controlled block information or on state information (such as the state of
their storage or their balance) that might be changed by other transactions are
prone to manipulations by miners. A typical example adduced in the literature is
the use of block timestamps as a source of randomness [13,21]. In a classical
lottery implementation that randomly pays out to one of the participants and
uses the block timestamp as a source of randomness, a malicious miner can easily
influence the result in his favor by selecting a beneficial timestamp.
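
The lottery attack can be sketched in a few lines of Python. The lottery rule (timestamp modulo the number of participants), the participant names, and the admissible timestamp range are assumptions chosen for illustration; real contracts and real miner constraints differ.

```python
def lottery_winner(participants, timestamp):
    """Toy lottery: the block timestamp is the only source of 'randomness'."""
    return participants[timestamp % len(participants)]


def pick_timestamp(participants, miner, earliest, latest):
    """Return a timestamp in [earliest, latest] that makes the miner win, if any."""
    for ts in range(earliest, latest + 1):
        if lottery_winner(participants, ts) == miner:
            return ts
    return None


participants = ["alice", "bob", "miner"]
chosen = pick_timestamp(participants, "miner", earliest=1000, latest=1010)
```

With three participants, any window of at least three consecutive timestamps contains a winning value for the miner, so the outcome is fully under the miner's control in this model.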

We capture the absence of the miner’s influence by two definitions, one saying 
that the outgoing Ether flows of a contract should not be influenced by compo- 
nents of the transaction environment that can be (within a certain range) set 
by miners and the other one saying that the Ether flows should not depend on 
those parts of the contract state that might have been influenced by previously 
executed transactions. The first definition rules out what is in the literature often 
described as timestamp dependency [13,21]. 

First, we define independence of (parts of) the transaction environment. To
this end, we assume $C_\Gamma$ to be the set of components of the transaction
environment and write $\Gamma =_{/c_\Gamma} \Gamma'$ to denote that the
transaction environments $\Gamma, \Gamma'$ are equal up to component $c_\Gamma$.


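Definition 7 (Independence of the Transaction Environment). A contract
$c \in \mathcal{C}$ is independent of a subset $I \subseteq C_\Gamma$ of
components of the transaction environment if for all $c_\Gamma \in I$ and all
reachable configurations $(\Gamma, s_c :: S)$ it holds for all $\Gamma'$ that

$$
\begin{aligned}
& c_\Gamma(\Gamma) \neq c_\Gamma(\Gamma') \;\wedge\; \Gamma =_{/c_\Gamma} \Gamma' \\
\wedge\; & \Gamma \vDash s_c :: S \xrightarrow{\pi}{}^{*} s'_c :: S \;\wedge\; \mathit{final}(s')
\;\wedge\; \Gamma' \vDash s_c :: S \xrightarrow{\pi'}{}^{*} s''_c :: S \;\wedge\; \mathit{final}(s'') \\
\implies\; & \pi \downarrow_{\mathit{calls}_c} = \pi' \downarrow_{\mathit{calls}_c}
\end{aligned}
$$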


Next, we define the notion of independence of the account state. Formally, we 
capture this property by requiring that the outgoing Ether flows of the contract 
under consideration should not be affected by those parameters of the contract 
that might have been changed by previous executions which are the balance, the 
account’s nonce, and the account’s persistent storage. 


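Definition 8 (Independence of Mutable Account State). A contract
$c \in \mathcal{C}$ is independent of the account state if for all reachable
configurations $(\Gamma, s_c :: S)$, $(\Gamma, s'_c :: S')$ with $s, s'$
differing only in the nonce, balance and storage for $\mathit{address}(c)$, it
holds that

$$
\begin{aligned}
& \Gamma \vDash s_c :: S \xrightarrow{\pi}{}^{*} s''_c :: S \;\wedge\; \mathit{final}(s'')
\;\wedge\; \Gamma \vDash s'_c :: S' \xrightarrow{\pi'}{}^{*} s'''_c :: S' \;\wedge\; \mathit{final}(s''') \\
\implies\; & \pi \downarrow_{\mathit{calls}_c} = \pi' \downarrow_{\mathit{calls}_c}
\end{aligned}
$$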


As for the other independence properties, both these properties can be
statically verified using program dependence graphs.
4.5 Classification of Bugs


The previously presented security definitions are motivated by the bugs that 
were observed in real Ethereum smart contracts and studied in [13,21]. Table 1 
gives an overview on the bugs from the literature that are ruled out by our 
security properties. 


Table 1. Bugs from [13,21] ruled out by the security properties


Security property                        | Bug
Call integrity                           | Reentrancy [13,21]
                                         | Call to the unknown [13]
Atomicity                                | Mishandled exceptions [13,21]
Independence of mutable account state    | Transaction order dependency [21]
                                         | Unpredictable state [13]
Independence of transaction environment  | Timestamp dependency [21]
                                         | Time constraints [13]
                                         | Generating randomness [13]


Our security properties do not cover all bugs described by Atzei et al. [13], 
as some of the bugs do not constitute violations of general security properties, 
i.e., properties that are not specific to the particular contract implementation. 
There are two classes of bugs that we do not consider: The first class deals 
with the occurrence of unexpected exceptions (such as the Gasless Send and 
the Call stack Limit bug) and the second class encompasses bugs caused by 
the Solidity semantics deviating from the programmer’s intuitions (such as the 
Keeping Secrets, Type Cast and Exception Disorders bugs). 

The first class of bugs encompasses runtime exceptions that are hard to 
predict for the developer and that are consequently not handled correctly. Of 
course, it would be possible to formalize the absence of those particular kinds 
of exceptions as simple reachability properties using the small-step semantics. 
Still, such properties would not give any insight about the security of a contract: 
the fact that a particular exception occurs can be unproblematic in the case 
that proper exception handling is in place. In general, the notion of a correct 
exception handling highly depends on the specific contract’s intended behavior. 
For the special case of out-of-gas exceptions, we could introduce the notion of 
atomicity in order to capture a generic goal of proper exception handling. But 
such a notion is not necessarily sufficient for characterizing reasonable ways of 
dealing with other kinds of runtime exceptions. 

The bugs of the second class are introduced on the Solidity level and are
similarly hard to account for by using generic security properties. Even though
these bugs might all originate from similar idiosyncrasies of the Solidity
semantics, the impact of the bugs on the contract's semantics might deviate a
lot. This might result in violations of the security properties discussed
before, but also in violating the contract's functional correctness.
Consequently, catching those bugs might require the introduction of
contract-specific correctness properties.

Finally, Atzei et al. [13] discuss the Ether Lost in Transfer bug. This bug is 
introduced by sending Ether to addresses that do not belong to any contract 
or user, so-called orphan addresses. We could easily formalize a reachability
property stating that no valid contract execution should ever send Ether to 
such an address. We omit such a definition here as it is quite straightforward 
and at the same time it is not a property that directly affects the security of 
an individual contract: Sending Ether to such an orphan address might have 
negative impacts on the overall system as money is effectively lost. For the 
specific contract sending this money, this bug can be seen as a corner case of 
sending Ether to an unintended address which rather constitutes a correctness 
violation. 


4.6 Discussion 


As previously discussed, we are not aware of any prior formal security definitions 
of smart contracts. Nevertheless, we compared our definitions with the verifica- 
tion conditions used in Oyente [21]. Our investigation shows that the verification 
conditions adopted in this tool are neither sound nor complete. 

For detecting mishandled exceptions, it is checked whether each CALL 
instruction in the contract code is directly followed by the ISZERO instruction 
that checks whether the top element of the stack is zero. Unfortunately, Oyente 
(although stated in the paper) does not implement this check, so that we needed 
to manually inspect the bytecodes for determining the outcomes of the syntactic 
check. As shown in Fig. 2a, a check for the call returning zero does not
necessarily imply proper exception handling and therefore atomicity of the contract.
This excerpt of a simple banking contract that keeps track of the users’ balances 
and allows users to withdraw their balances using the function withdraw checks 
for the success of the performed call, but still does not react accordingly. It only 
makes sure that the number of successes is updated consistently, but does not 
perform the update on the user’s balance record according to the call outcome. 
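
The syntactic pattern described above (every CALL directly followed by ISZERO) can be sketched as a scan over an opcode list. This is not Oyente's code: the function `unchecked_calls` and both opcode sequences are hypothetical and hand-written, not actual compiler output.

```python
def unchecked_calls(opcodes):
    """Return indices of CALL instructions not directly followed by ISZERO."""
    return [
        i
        for i, op in enumerate(opcodes)
        if op == "CALL" and (i + 1 >= len(opcodes) or opcodes[i + 1] != "ISZERO")
    ]


# Hypothetical sequence in which the call result is tested immediately.
checked_directly = ["PUSH1", "CALL", "ISZERO", "PUSH2", "JUMPI"]
# Hypothetical sequence in which the result is moved around before the test.
checked_later = ["PUSH1", "CALL", "SWAP1", "POP", "ISZERO", "PUSH2", "JUMPI"]
```

The pattern only inspects the instruction after the call, so it says nothing about what the contract does with the outcome; a sequence that passes the scan may still update its state incorrectly.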

On the other hand, not performing the desired check does not imply the
absence of atomicity, as illustrated in Fig. 2b. Writing the outcome in some
variable before checking it satisfies the negative pattern, but correct
exception handling is still performed. For detecting timestamp dependency,
Oyente checks whether the contract has a symbolic execution path with the
timestamp (which is represented as its own symbolic variable) being included in
one of its constraints. This definition, however, does not capture the case
shown in Fig. 2c.

This contract is clearly timestamp dependent, as whether or not the function
pay pays out some money to the sender depends on the timestamp set when
creating the contract. A malicious miner could consequently manipulate the
block timestamp for a transaction that creates such a contract in a way that
money is paid out and then subsequently query it to drain it. This is, however,
not captured by the characterization of the property in Oyente, as it only
captures the local execution paths of the contract.
(a)
    contract SimpleBank {
        mapping(address => uint) bal;
        uint successes;
        function withdraw() {
            if (msg.sender.send(bal[msg.sender]))
                { successes++; }
            bal[msg.sender] = 0;
        }
    }

(b)
    contract SimpleBank {
        mapping(address => uint) bal;
        function withdraw() {
            bool b = msg.sender.send(bal[msg.sender]);
            if (b) bal[msg.sender] = 0;
        }
    }

(c)
    contract Test {
        uint time = block.timestamp;
        function pay() {
            if (time % 2 == 1) {
                msg.sender.send(100);
            }
        }
    }

(d)
    contract Test {
        function pay() {
            if (block.timestamp % 2 == 1 ||
                block.timestamp % 2 == 0) {
                msg.sender.send(100);
            }
        }
    }

(e)
    contract Fund {
        mapping(address => uint) shares;
        function withdraw() {
            if (msg.sender.send(shares[msg.sender]))
                shares[msg.sender] = 0;
        }
    }

(f)
    contract Bob {
        bool sent = false;
        function ping(address c) {
            if (!sent) {
                sent = true;
                c.call.value(2)();
            }
        }
    }


Fig. 2. (a) Exception handling: false negative (b) Exception handling: false positive
(c) Timestamp dependency: false negative (d) Timestamp dependency: false positive
(e) Reentrancy: false negative (f) Reentrancy: false positive


On the other hand, using the block timestamp in path constraints does not 
imply a dependency as can easily be seen by the example in Fig. 2d. 
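
The difference between mentioning the timestamp and depending on it can be made concrete by comparing call traces across timestamps, in the spirit of Definition 7. The Python model below is a sketch: `independent_of_timestamp`, `pay_fig_2d`, and `pay_parity` are hypothetical names, `pay_parity` is an invented contrast case, and sampling timestamps can only refute independence.

```python
def independent_of_timestamp(contract, timestamps):
    """Compare the call traces of a toy contract across sampled timestamps."""
    traces = {tuple(contract(ts)) for ts in timestamps}
    return len(traces) <= 1


def pay_fig_2d(timestamp):
    # Models Fig. 2d: the branch condition mentions the timestamp
    # but holds for every value.
    calls = []
    if timestamp % 2 == 1 or timestamp % 2 == 0:
        calls.append(("send", 100))
    return calls


def pay_parity(timestamp):
    # Hypothetical variant whose outgoing call really depends on the timestamp.
    calls = []
    if timestamp % 2 == 1:
        calls.append(("send", 100))
    return calls
```

A check that only looks for the timestamp in path constraints would treat both functions alike, whereas the trace comparison separates them.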

For the transaction order dependency and the reentrancy property, we were 
unfortunately not able to reconcile the property characterization provided in the 
paper with the implementation of Oyente. 

For checking reentrancy according to the paper, it should be checked whether 
the constraints on the path leading to a CALL instruction can still be satisfied 
after performing the updates on the path (e.g. changing the storage). If so, the 
contract is flagged as reentrant. According to our understanding, this approach 
should not flag contracts that correctly guard their calls as reentrant. Still, by 
the version of Oyente provided with the paper the contract in Fig. 2f is tagged 
as reentrant. 
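
Our reading of the criterion described in the paper can be sketched as follows. This is an interpretation, not Oyente's implementation: the function `flagged_reentrant` is hypothetical, and it evaluates the guard on one concrete storage instead of solving path constraints symbolically.

```python
def flagged_reentrant(guard, updates_before_call, storage):
    """Flag a call if its guard holds and still holds after the storage
    updates performed on the path before the call."""
    if not guard(storage):
        return False
    updated = dict(storage)
    updated.update(updates_before_call)
    return guard(updated)


# Fig. 2f: the call is guarded by !sent, and sent is set before the call.
flag_f = flagged_reentrant(
    guard=lambda st: not st["sent"],
    updates_before_call={"sent": True},
    storage={"sent": False},
)

# A call with no guard and no preceding update, as in Fig. 2e.
flag_e = flagged_reentrant(
    guard=lambda st: True,
    updates_before_call={},
    storage={"shares": 5},
)
```

Under this reading the guarded contract of Fig. 2f is not flagged, which matches the expectation stated above.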

There exists an updated version of Oyente [28] that is able to precisely tag this
contract as not reentrant, but we could not find any concrete information on the
criteria used for checking this property. Still, we found out that the underlying
characterization cannot be sufficient for detecting reentrancy, as the contract in
Fig. 2e is classified as not exhibiting a reentrancy vulnerability even though it
should be, since the send command also executes the recipient's callback function
(even though with limited gas). The example is taken from the Solidity
documentation [23], where it is listed as a negative example. For transaction
order dependency, Oyente should check whether execution traces exhibiting
different Ether flows exist. But it turned out that not even a simple example of
a transaction-order-dependent contract can be detected by any of the versions of
Oyente.


5 Conclusions 


We presented the first complete small-step semantics of EVM bytecode and for- 
malized a large fragment thereof in the F* proof assistant, successfully validating 
it against the official Ethereum test suite. We further defined for the first time a 
number of salient security properties for smart contracts, relying on a combina- 
tion of hyper- and safety properties. Our framework is available to the academic 
community in order to facilitate future research on rigorous security analysis of 
smart contracts. 

In particular, this work opens up a number of interesting research directions. 
First, it would be interesting to formalize in F* the semantics of Solidity code 
and a compiler from Solidity into EVM, formally proving its soundness against 
our semantics. This would allow us to provide software developers with a tool 
to verify the security of their code, from which they could obtain bytecode that 
is secure by construction. Second, we intend to design an efficient static analysis 
technique for EVM bytecode and to formally prove its soundness against our 
semantics. 
Abstract. Blockchain-based distributed computing platforms enable
the trusted execution of computation—defined in the form of smart con- 
tracts—without trusted agents. Smart contracts are envisioned to have 
a variety of applications, ranging from financial to IoT asset tracking. 
Unfortunately, the development of smart contracts has proven to be 
extremely error prone. In practice, contracts are riddled with security 
vulnerabilities comprising a critical issue since bugs are by design non- 
fixable and contracts may handle financial assets of significant value. To 
facilitate the development of secure smart contracts, we have created 
the FSolidM framework, which allows developers to define contracts as 
finite state machines (FSMs) with rigorous and clear semantics. FSolidM 
provides an easy-to-use graphical editor for specifying FSMs, a code gen- 
erator for creating Ethereum smart contracts, and a set of plugins that 
developers may add to their FSMs to enhance security and functionality. 


Keywords: Smart contract · Security · Finite state machine · Ethereum ·
Solidity · Automatic code generation · Design patterns


1 Introduction 


In recent years, blockchains have seen wide adoption. For instance, the mar- 
ket capitalization of Bitcoin, the leading blockchain-based cryptocurrency, has 
grown from $15 billion to more than $100 billion in 2017. The goal of the first 
generation of blockchains was only to provide cryptocurrencies and payment sys- 
tems. In contrast, more recent blockchains, such as Ethereum, strive to provide 
distributed computing platforms [1,2]. Blockchain-based distributed computing 
platforms enable the trusted execution of general purpose computation, imple- 
mented in the form of smart contracts, without any trusted parties. Blockchains 
and smart contracts are envisioned to have a variety of applications, ranging from 
finance to IoT asset tracking [3]. As a result, they are embraced by an increasing 
number of organizations and companies, including major IT and financial firms, 
such as Cisco, IBM, Wells Fargo, and J.P. Morgan [4]. 
However, the development of smart contracts has proven to be extremely
error prone in practice. Recently, an automated analysis of a large sample of 
smart contracts from the Ethereum blockchain found that more than 43% of 
contracts have security issues [5]. These issues often result in security vulnera- 
bilities, which may be exploited by cyber-criminals to steal cryptocurrencies and 
other digital assets. For instance, in 2016, $50 million worth of cryptocurrencies 
were stolen in the infamous “The DAO” attack, which exploited a combination 
of smart-contract vulnerabilities [6]. In addition to theft, malicious attackers 
may also be able to cause damage by leading a smart contract into a deadlock, 
which prevents account holders from spending or withdrawing their own assets. 

The prevalence of smart-contract vulnerabilities poses a severe problem in 
practice due to multiple reasons. First, smart contracts handle assets of signifi- 
cant financial value: at the time of writing, contracts deployed on the Ethereum 
blockchain together hold more than $6 billion worth of cryptocurrency. Second, 
it is by design impossible to fix bugs in a contract (or change its functionality in 
any way) once the contract has been deployed. Third, due to the “code is law” 
principle [7], it is also by design impossible to remove a faulty or malicious trans- 
action from the blockchain, which means that it is often impossible to recover 
from a security incident.¹

Previous work focused on alleviating security issues in existing smart con- 
tracts by providing tools for verifying correctness [7] and for identifying com- 
mon vulnerabilities [5]. In contrast, we take a different approach by developing a 
framework, called FSolidM [9], which helps developers to create smart contracts 
that are secure by design. The main features of our framework are as follows. 


Formal Model: One of the key factors contributing to the prevalence of secu- 
rity issues is the semantic gap between the developers’ assumptions about the 
underlying execution semantics and the actual semantics of smart contracts [5]. 
To close this semantic gap, FSolidM is based on a simple, formal, finite-state 
machine (FSM) based model for smart contracts, which we introduced in [9]. 
The model was designed to support Ethereum smart contracts, but it could 
easily be extended to other platforms. 


Graphical Editor: To further decrease the semantic gap and facilitate develop- 
ment, FSolidM provides an easy-to-use graphical editor that enables developers 
to design smart contracts as FSMs. 


Code Generator: FSolidM provides a tool for translating FSMs into Solidity, 
the most widely used high-level language for developing Ethereum contracts. 
Solidity code can be translated into Ethereum Virtual Machine bytecode, which 
can be deployed and executed on the platform. 


Plugins: FSolidM enables extending the functionality of FSM-based smart
contracts using plugins. As part of our framework, we provide a set of plugins
that address common security issues and implement common design patterns, which
were identified by prior work [5,10,11]. In Table 1, we list these
vulnerabilities and patterns with the corresponding plugins.


¹ It is possible to remove a transaction or hard fork the blockchain if the stakeholders
reach a consensus; however, this undermines the trustworthiness of the platform [8].


Table 1. Common smart-contract vulnerabilities and design patterns


Type            | Common name                 | FSolidM plugin
Vulnerabilities | Reentrancy [5,10]           | Locking
                | Transaction ordering [5,10] | Transition counter
Patterns        | Time constraint [11]        | Timed transitions
                | Authorization [11]          | Access control


Open Source: FSolidM is open-source and available online (see Sect. 3). 

The advantages of our framework, which helps developers to create secure con- 
tracts instead of trying to fix existing ones, are threefold. First, we decrease the 
semantic gap and eliminate the issues arising from it by providing a formal model 
and an easy-to-use graphical editor. Second, since the process is rooted in rigor- 
ous semantics, our framework may be connected to formal analysis tools [12,13]. 
Third, the code generator and plugins enable developers to implement smart
contracts with a minimal amount of error-prone manual coding.

The rest of this paper is organized as follows. In Sect. 2, we present a blind
auction as a motivating example, which we implement as an FSM-based smart
contract. In Sect. 3, we describe our FSolidM tool and its built-in plugins. Finally, 
in Sect. 4, we offer concluding remarks and outline future work. 


2 Defining Smart Contracts as FSMs 


Consider as an example a blind auction (similar to the one presented in [14]), in 
which a bidder does not send her actual bid but only a hash of it (i.e., a blinded 
bid). A bidder is required to make a deposit—which does not need to be equal 
to her actual bid—to prevent her from not paying after she has won the auction. 
A deposit is considered valid if its value is higher than or equal to the actual 
bid. A blind auction has four main states: 


1. AcceptingBlindedBids: blinded bids and deposits may be submitted; 

2. RevealingBids: bidders may reveal their bids (i.e., they can send their actual 
bids and the contract checks if the hash value is the same as the one submitted 
in the previous state and if they made sufficient deposit); 

3. Finished: the highest bid wins the auction; bidders can withdraw their 
deposits except for the winner, who can withdraw only the difference between 
her deposit and bid; 

4. Canceled: bidders can retract bids and withdraw their deposits. 


Since smart contracts have states (e.g., AcceptingBlindedBids) and provide
functions that allow other entities (e.g., contracts or users) to invoke actions
that change the current state of a contract, they can be naturally represented as
FSMs [15]. An FSM has a finite set of states and a finite set of transitions between
these states. A transition forces a contract to take a set of actions if the associated
conditions, i.e., the guards of the transition, are satisfied. Since such states and
transitions have intuitive meaning for developers, representing contracts as FSMs
provides an adequate level of abstraction for behavior reasoning.

Fig. 1. Example FSM for blinded auctions. (Transition labels recoverable from the
figure: bid, close [now > creationTime + 5 days], reveal [values.length ==
secret.length], finish [now >= creationTime + 10 days], cancelABB, cancelRB, unbid.)

Figure 1 presents the blind auction example in the form of an FSM.
For simplicity, we have abbreviated AcceptingBlindedBids, RevealingBids,
Finished, and Canceled to ABB, RB, F, and C, respectively. ABB is the initial
state of the FSM. Each transition (e.g., bid, reveal, cancel) is associated with a
set of actions that a user can perform during the blind auction. For instance, a
bidder can execute the bid transition at the ABB state to send a blind bid and a
deposit value. Similarly, a user can execute the close transition, which signals
the end of the bidding period, if the associated guard now >= creationTime
+ 5 days evaluates to true. To differentiate transition names from guards, we
use square brackets for the latter. A bidder can reveal her bids by executing
the reveal transition. The finish transition signals the completion of the
auction, while the cancelABB and cancelRB transitions signal the cancellation of
the auction. Finally, the unbid and withdraw transitions can be executed by
the bidders to withdraw their deposits. For ease of presentation, we omit from
Fig. 1 the actions that correspond to each transition. For instance, during the
execution of the withdraw transition, the following action is performed:
amount = pendingReturns[msg.sender].
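
The guarded-transition reading of the blind auction can be sketched as a small Python state machine. This is an illustration only, not code produced by FSolidM: the class `BlindAuctionFSM` and the table `TRANSITIONS` are hypothetical, the source and target state of each transition are inferred from the transition names and the description above, only the two time guards are modeled, and transition actions are omitted.

```python
DAY = 24 * 60 * 60  # seconds

# (source state, transition) -> (target state, guard).
# Guards take (now, creation_time); endpoints are assumptions, see lead-in.
TRANSITIONS = {
    ("ABB", "bid"): ("ABB", lambda now, t0: True),
    ("ABB", "close"): ("RB", lambda now, t0: now >= t0 + 5 * DAY),
    ("ABB", "cancelABB"): ("C", lambda now, t0: True),
    ("RB", "reveal"): ("RB", lambda now, t0: True),
    ("RB", "finish"): ("F", lambda now, t0: now >= t0 + 10 * DAY),
    ("RB", "cancelRB"): ("C", lambda now, t0: True),
    ("F", "withdraw"): ("F", lambda now, t0: True),
    ("C", "unbid"): ("C", lambda now, t0: True),
}


class BlindAuctionFSM:
    def __init__(self, creation_time):
        self.state = "ABB"  # initial state
        self.creation_time = creation_time

    def fire(self, transition, now):
        key = (self.state, transition)
        if key not in TRANSITIONS:
            raise ValueError(f"{transition} is not enabled in state {self.state}")
        target, guard = TRANSITIONS[key]
        if not guard(now, self.creation_time):
            raise ValueError(f"guard of {transition} is not satisfied")
        self.state = target
```

A transition fired in the wrong state or with an unsatisfied guard is rejected and leaves the state unchanged, which mirrors the intuition that a guard must hold before the associated actions are taken.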


3 The FSolidM Tool 


FSolidM is an open-source², web-based tool that is built on top of WebGME [16].
FSolidM enables collaboration between multiple users during the development 
of smart contracts. Changes in FSolidM are committed and versioned, which 
enables branching, merging, and viewing the history of a contract. We present 
the FSolidM tool in more detail in [17]. 

To generate the Solidity code of a smart contract using FSolidM, a user must
follow three steps: (1) specify the smart contract in the form of the FSM by using
the dedicated graphical editor of FSolidM; (2) specify attributes of the smart
contract such as variable definitions, statements, etc. in the Property Editor or
in the dedicated Solidity code editor of FSolidM; (3) optionally apply security
patterns and functionality extensions, and finally, generate the Solidity code.
Figure 2 shows the graphical and code editors of the tool (for steps 1 and 2) and
the list of services (i.e., AddSecurityPatterns and SolidityCodeGenerator for
step 3) that are provided by FSolidM. We have integrated a Solidity parser³ to
check the syntax of the Solidity code that is given as input by the users.


² https://github.com/anmavrid/smart-contracts.


Fig. 2. The FSolidM model and code editors. (Screenshot: the blind auction FSM in
the graphical editor and, in the code editor, the declarations struct Bid {
bytes32 blindedBid; uint deposit; }, mapping(address => Bid[]) bids,
mapping(address => uint) pendingReturns, and address highestBidder.)

Notice that in Fig. 2, parts of the code shown in the code editor are darker
(lines 1-10) than other parts (lines 12-15). The darker lines contain code
that was generated from the FSM model defined in the graphical editor and are
locked, i.e., they cannot be altered in the code editor. The lighter parts indicate
code that was directly specified in the code editor.

FSolidM provides mechanisms for checking if the FSM is correctly specified 
(e.g., whether an initial state exists or not). FSolidM notifies developers of errors 
and provides links to the erroneous nodes of the model (e.g., a transition or a 
guard). Through the SolidityCodeGenerator service, FSolidM provides an
FSM-to-Solidity code generator. Additionally, through the AddSecurityPatterns
service, FSolidM enables developers to enhance the functionality and security of
contracts conveniently by adding plugins to them. Our framework provides four 
built-in plugins: locking, transition counter, timed transitions, and access con- 
trol. Plugins can be simply added with a “click,” as shown in Fig. 3. 


Locking: When an Ethereum contract calls a function of another contract, 
the caller has to wait for the call to finish. This allows the callee—who may 
be malicious—to exploit the intermediate state of the caller, e.g., by invoking 
a function of the caller. This re-entrancy issue is one of the most well-known 
vulnerabilities, which was also exploited in the infamous “The DAO” attack. 
To prevent re-entrancy, we provide a security plugin for locking the smart 
contract. Locking eliminates re-entrancy vulnerabilities in a “foolproof” manner: 
functions within the contract cannot be nested within each other in any way. 
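
The locking idea can be illustrated with a minimal Python sketch. It is not the Solidity code that FSolidM generates; it only mimics the behavior described above, assuming a boolean flag that is set on entry to any guarded function and cleared on exit. The names `locking`, `ReentrancyError`, and `Auction` are hypothetical.

```python
import functools


class ReentrancyError(Exception):
    """Raised when a guarded function is entered while another is running."""


def locking(func):
    """Mimics a lock modifier: no guarded call may be nested in another."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if self._locked:
            raise ReentrancyError(func.__name__)
        self._locked = True
        try:
            return func(self, *args, **kwargs)
        finally:
            self._locked = False
    return wrapper


class Auction:
    """Toy stand-in for a contract with a withdraw transition."""

    def __init__(self):
        self._locked = False
        self.pending_returns = {"alice": 10}
        self.paid = []

    @locking
    def withdraw(self, sender, on_payment=None):
        amount = self.pending_returns.get(sender, 0)
        if on_payment is not None:
            on_payment()  # external call: the callee may try to re-enter
        self.pending_returns[sender] = 0
        self.paid.append((sender, amount))
        return amount
```

If the callback passed as `on_payment` calls `withdraw` again, the nested call raises `ReentrancyError` before it can observe the intermediate state.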


3 https://github.com/ConsenSys/solidity-parser.


Fig. 3. Running the AddSecurityPatterns. [Screenshot not reproduced: the
AddSecurityPatterns v0.1.0 dialog, listing Locking, Transition counter, Timed
transitions, Access control, and Events.]


Transition Counter: If multiple function calls are invoked around the same
time, then the order in which these calls are executed on the Ethereum blockchain 
may be unpredictable. Hence, when a user invokes a function, she may be unable 
to predict what the state and the values stored within a contract will be when 
the function is actually executed. This issue has been referred to as “transaction- 
ordering dependence” [5] and “unpredictable state” [10], and it can lead to var- 
ious security vulnerabilities. 

We provide a plugin that prevents unpredictable-state vulnerabilities by 
enforcing a strict ordering on function call executions. The plugin expects a 
transition number in every function as a parameter and ensures that the num- 
ber is incremented by one for each function execution. As a result, when a user 
invokes a function with the next transition number in sequence, she can be sure 
that the function is executed before any other state changes can take place. 
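
A minimal Python sketch of this check follows. It assumes that every function receives the next transition number as its first argument and that a mismatch aborts the call; the class and method names are hypothetical and do not come from the generated Solidity code.

```python
class StaleTransitionError(Exception):
    """Raised when the caller's expected transition number is out of date."""


class CountedContract:
    """Toy contract whose every function takes the next transition number."""

    def __init__(self):
        self.transition_counter = 0
        self.highest_bid = 0

    def _check_and_advance(self, next_transition_number):
        # The call is valid only if no other transition ran in between.
        if next_transition_number != self.transition_counter + 1:
            raise StaleTransitionError(next_transition_number)
        self.transition_counter += 1

    def bid(self, next_transition_number, amount):
        self._check_and_advance(next_transition_number)
        self.highest_bid = max(self.highest_bid, amount)
```

A caller who read the counter as 0 submits `bid(1, amount)`. If another call is executed first, the counter is already 1 and the stale call is rejected instead of running against an unexpected state.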


Automatic Timed Transitions: We provide a plugin for implementing time- 
constraint patterns. We extend our language with timed transitions, which are 
similar to non-timed transitions, but (1) their guards and assignments do not use 
input or output data and (2) they include a number specifying transition time. 

We implement timed transitions as a modifier that is applied to every func- 
tion, and which ensures that timed transitions are executed automatically if their 
time and data guards are satisfied. Writing such modifiers manually could lead 
to vulnerabilities. For example, a developer might forget to add a modifier to a 
function, which enables malicious users to invoke functions without the contract 
progressing to the correct state (e.g., place bids in an auction even though the 
auction should have already been closed due to a time limit). 
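
The modifier-based approach can be sketched in Python as a decorator that fires any due timed transition before the function body runs. This is only an illustration under stated assumptions: the current time is passed explicitly as `now`, the five-day bidding period is an illustrative value, and all names are hypothetical.

```python
import functools


def timed_transitions(func):
    """Modifier applied to every function: fire due timed transitions first."""
    @functools.wraps(func)
    def wrapper(self, now, *args, **kwargs):
        self.fire_due_transitions(now)
        return func(self, now, *args, **kwargs)
    return wrapper


class TimedAuction:
    BIDDING_PERIOD = 5 * 24 * 3600  # seconds; illustrative value

    def __init__(self, creation_time):
        self.creation_time = creation_time
        self.state = "AcceptingBids"
        self.bids = []

    def fire_due_transitions(self, now):
        # Timed transition "close": its guard uses only time and stored state.
        deadline = self.creation_time + self.BIDDING_PERIOD
        if self.state == "AcceptingBids" and now >= deadline:
            self.state = "Closed"

    @timed_transitions
    def bid(self, now, amount):
        if self.state != "AcceptingBids":
            return False
        self.bids.append(amount)
        return True
```

Because the decorator is attached by the generator rather than by hand, no function can be left unguarded by oversight, which is the failure mode described above.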


Access Control: In many contracts, access to certain transitions (i.e., func- 
tions) needs to be controlled and restricted. For example, any user can participate 
in a typical blind auction by submitting a bid, but only the creator should be 
able to cancel the auction. To facilitate the enforcement of such constraints, we 
provide a plugin that (1) manages a list of administrators at runtime (identified 
by their addresses) and (2) enables developers to forbid non-administrators from 
accessing certain functions. 
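
As a rough Python illustration of the two parts of this plugin (a runtime administrator list and a per-function restriction), consider the sketch below. The decorator `only_admin`, the class `ControlledAuction`, and the address strings are hypothetical.

```python
import functools


class AccessDenied(Exception):
    """Raised when a non-administrator calls a restricted function."""


def only_admin(func):
    """Restrict a function to callers in the contract's administrator set."""
    @functools.wraps(func)
    def wrapper(self, sender, *args, **kwargs):
        if sender not in self.admins:
            raise AccessDenied(sender)
        return func(self, sender, *args, **kwargs)
    return wrapper


class ControlledAuction:
    def __init__(self, creator):
        self.admins = {creator}  # administrator list, managed at runtime
        self.cancelled = False
        self.bids = []

    def bid(self, sender, amount):
        # Unrestricted: any user may participate.
        self.bids.append((sender, amount))

    @only_admin
    def add_admin(self, sender, new_admin):
        self.admins.add(new_admin)

    @only_admin
    def cancel(self, sender):
        self.cancelled = True
```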


4 Conclusion and Future Work


Blockchain-based decentralized computing platforms with smart-contract func- 
tionality are envisioned to have a significant technological and economic impact 
in the future. However, if we are to avoid an equally significant risk of security 
incidents, we must ensure that smart contracts are secure. To facilitate the devel- 
opment of smart contracts that are secure by design, we created the FSolidM 
framework, which enables designing contracts as FSMs. Our framework is rooted 
in rigorous yet clear semantics, and it provides an easy-to-use graphical editor 
and code generator. We also implemented a set of plugins that developers can 
use to enhance the security or functionality of their contracts. In the future, we 
plan to integrate model checkers and compositional verification tools into our 
framework [12,13] to enable the verification of security and safety properties.


Abstract. An ongoing challenge with differentially private database
systems is that of maximizing system utility while staying within a 
certain privacy budget. One approach is to maintain per-user budgets 
instead of a single global budget, and to silently drop users whose budget 
is depleted. This, however, can lead to very misleading analyses because 
the system cannot provide the analyst any information about which users 
have been dropped. 

This paper presents UniTraX, the first differentially private system 
that allows per-user budgets while providing the analyst information 
about the budget state. The key insight behind UniTraX is that it tracks 
budget not only for actual records in the system, but at all points in the 
domain of the database, including points that could exist but do not. 
UniTraX can safely report the budget state because the analyst does not 
know if the state refers to actual records or not. We prove that UniTraX 
is differentially private. UniTraX is compatible with existing differen- 
tially private analyses and our implementation on top of PINQ shows 
only moderate runtime overheads on a realistic workload. 


1 Introduction 


Differential Privacy (DP) is a model of anonymity that measures privacy loss 
resulting from queries made to a database [6]. A bound on privacy loss can be 
enforced by preventing queries after a privacy budget has been exceeded. An 
ongoing challenge with DP systems is that of maximizing system utility while 
staying within a privacy budget, where system utility is measured in terms of 
both number of queries and amount of distortion (noise) in query answers. 

A simple but common approach to DP budgets is to maintain a single global 
budget. With this approach, all queries draw from the budget regardless of how 
many user records are used to answer a given query. In systems where users 
can specify their own individual budgets, the global budget is effectively the 
minimum of user budgets. 


© The Author(s) 2018
L. Bauer and R. Küsters (Eds.): POST 2018, LNCS 10804, pp. 278-299, 2018.
https://doi.org/10.1007/978-3-319-89722-6_12


An alternative approach is to maintain per-user budgets. The idea here is that
a given query draws only from the budgets of users whose records contribute to 
the answer. This can substantially improve system utility. An analysis that for 
instance targets smokers in a medical dataset would not reduce the budgets of 
non-smokers. Furthermore, per-user budgets maximize utility in systems where 
users specify their individual budgets because low-budget users do not constrain 
the queries that are made over only high-budget users. 

In spite of the tremendous potential for increasing the utility of DP systems, 
we are aware of only a single system, ProPer [10], that tracks per-user budgets.¹
This is because of a fundamental difficulty with per-user budget systems. Namely, 
the system cannot report on the remaining budget of individual users without 
revealing private information. If budgets were made public in this way, then an 
analyst could trivially obtain information about users just by observing which 
users’ budgets changed in response to a query. 

Because of this, ProPer keeps user budgets private: it silently drops the record 
of a user from the dataset when the user’s budget is depleted. This creates a 
serious usability problem for the analyst. Suppose there are two analysts, Alice 
and Bob. Alice wishes to learn about smokers, Bob wishes to learn about lung- 
cancer patients. Suppose Alice makes a set of queries about smokers, and as 
a result many smokers’ budgets are depleted and these smokers’ records are 
dropped from the dataset. Afterwards Bob asks the question: “What fraction of 
lung cancer patients are smokers?”. Because many smokers have been dropped 
from the dataset, and non-smokers have not, Bob’s answer is incorrect. Worse, 
Bob has no way of knowing whether the answer is incorrect, or how incorrect it 
is. Bob’s answer is effectively useless. We call this unknown dataset bias. 

To address this problem, this paper presents UniTraX, a DP system that 
allows for the benefits of keeping per-user budgets without the disadvantage 
of unknown dataset bias. The key insight of UniTraX is in how it tracks bud- 
get. Rather than privately tracking individual users’ remaining budget, UniTraX 
publicly tracks the budget consumed by prior queries over regions of the data 
parameter space. In addition, UniTraX adds each user’s initial budget to the 
dataset, making it a queryable parameter. 

For example, assume a query asks for the count of users between the ages 
of 10 and 20. ProPer would privately deduct the appropriate amount from the 
individual remaining budget of all users in that age range. By contrast, UniTraX 
publicly records that a certain amount of budget was consumed for the age 
range 10-20. Because the consumed budget is public, the analyst can calculate 
how much initial budget any given point in the data parameter space would 
need in order to still have enough remaining budget for some specific query 
the analyst may wish to make. Because initial budgets are also a queryable 
parameter, the analyst can then explicitly exclude from the query any points 
whose initial budget is too small. This allows the analyst to control which points 
are included in answers and therefore avoid unknown dataset bias. (See Sect. 2 
for a detailed example.) 


1 Other DP systems also permit per-user or per-field initial budgets [1,15]. However, 
these systems do not track the consumption of budget on a per-user basis.


Fig. 1. System comparison


Internally, UniTraX utilizes the same calculation of required initial budget 
to reject any query that covers points without sufficient budget. Critically, such 
a rejection does not leak any private information as it solely depends on public 
budget consumption data and query parameters. In fact, the decision to reject 
a query does not even look at the actual data. 

A significant practical concern is that tracking budgets across the entire 
parameter space, which will usually be substantially larger than the number 
of actual records in the database, can be quite expensive. To understand this 
cost, we built a prototype implementation of UniTraX on top of PINQ [17]. 
By carefully clubbing budgets over contiguous regions of the parameter space, 
we obtain average overheads of less than 80% over a no-privacy baseline on a 
realistic workload. 

The contributions of this paper are threefold: 


1. A system model and design that maintains the advantages of per-user privacy 
budgets, while avoiding the problems due to unknown dataset bias. 

2. A theoretical framework and proof that the design provides DP. 

3. An implementation and evaluation showing that the system is able to effi- 
ciently track budgets with average overheads of less than 80%. 


In Sect. 2 we compare different system models for DP and provide an example 
to illustrate the effect of unknown dataset bias. We introduce the design of 
UniTraX in Sect. 3 and detail the theoretical framework and the proof of DP in 
Sect. 4. Our implementation and its evaluation are presented in Sects. 5 and 6. 
We discuss related work in Sect. 7 and conclude in Sect. 8. 


2 System Comparison 


To better understand the differences and advantages of UniTraX, we start with 
overviews of UniTraX and two prior system models, the classic DP “reference” 
model with a global budget, and ProPer with private per-user budgets. We use 
a simple running example to illustrate the differences. Figure 1 contrasts the 
public, per-user budget model of UniTraX with DP reference and ProPer. 

For the example we assume that two analysts Alice and Bob want to analyze 
a dataset of patient records. These records contain a variety of fields among 
which is one that indicates whether a patient is a smoker, and one that indicates 
whether the patient suffers from lung cancer. We assume that Alice is interested 
in smokers and wants to run various queries over different fields of smokers while 
Bob is interested in the fraction of lung cancer patients that are smokers. We 
assume that Alice does her analysis first, followed by Bob.

Regarding the setting of each patient’s (user’s) initial budget, we consider
two cases: (1) all initial budgets are the same (uniform initial budgets), and 
(2) each budget is set by the user (non-uniform initial budgets). In the case of 
UniTraX, the initial budget is just another field in each record. 


DP Reference. The DP reference mechanism uses a publicly visible global bud- 
get. In the case of uniform initial budgets, the global budget is set as the system 
default. In the case of non-uniform initial budgets, the global budget is set to 
the lowest initial budget among all users. 

The reference mechanism counts every query against this single global bud- 
get. First, Alice runs her queries against smokers. Since each query decrements 
from the global budget, this budget may well be depleted before Bob can even 
start. At this point no information about non-smokers will have left the system. 
Still, the system has to reject all further queries. 


ProPer. ProPer tracks one budget per user but must keep it private. Users whose 
budgets are depleted are silently dropped from the dataset and not considered 
for any further queries. Nevertheless, each user’s full budget can be used. 

Staying in our example, Alice’s queries use no budget of non-smokers under 
this tracking mechanism. Once Alice has finished her queries, Bob starts his 
analysis. Bob wishes to make two queries, one counting the number of smokers 
with lung cancer, and one counting the number of non-smokers with lung cancer. 
Bob may look at Alice’s queries, and observe that she focused on smokers, and 
therefore know that there is a danger that his answers will be biased against 
smokers. In the general case, however, he cannot be sure if his answers are 
biased or not. 

In the case of uniform budgets, if Alice requested histograms, then she would 
have consumed the smokers’ budgets uniformly and depleted either all or none 
of the smokers’ budgets. If Bob gets an answer that, keeping in mind the noise, 
is significantly larger than zero, then Bob’s confidence that his answer is non- 
biased may be high. If on the other hand Alice focused some of her queries on 
specific ranges (e.g., certain age groups), or if budgets are non-uniform, then 
Bob knows that the answer for smokers with lung cancer may be missing users, 
while the answer for non-smokers with lung cancer will not. He may therefore 
have unknown dataset bias, and cannot confidently carry out his analysis. 


Our System (UniTraX). UniTraX tracks public budgets that are computable 
from the history of previous queries. UniTraX is able to tell an analyst how 
much budget has been consumed by previous queries for any subspace of the 
parameter space. For example, the analyst may request how much budget has
been consumed in the subspace defined by “age>10 AND age<20 AND
gender=male AND smoker=1”.

UniTraX tracks budget consumption over regions of the parameter space. For 
example, if a query selects records over the subspace “age>10 AND age<20”, 
then UniTraX records (publicly) that a certain amount of budget has been consumed
from this region of the parameter space. Initial budgets are an additional
dimension of the parameter space in UniTraX. In particular, the initial budget
of an actual record in the database is stored in a field in the record. By com- 
paring the (public) consumed budget of any point in the parameter space to 
the initial budget of that point, UniTraX can determine publicly whether that 
point’s budget has been fully consumed or not. This allows UniTraX to reject a 
query safely: If, after the query, the consumed budget of any point selected by 
the query will exceed that point’s initial budget, then the query is immediately 
rejected. This decision does not require looking at the actual data, and reveals 
no private information. 

Critically, public consumed budgets combined with the ability to filter queries 
based on users’ initial budgets allows analysts to control and eliminate unknown 
dataset bias. Returning to our example, when Bob is ready to start his analysis, 
he queries UniTraX to determine the consumed budgets for “smoker=1 AND 
disease=lungCancer”, and “smoker=0 AND disease=lungCancer”. Because no 
queries have been made for non-smokers, the consumed budget of the latter 
query’s region would be zero. Suppose that UniTraX indicates that the consumed 
budget for the region “smoker=1 AND disease=lungCancer” is 50, and that 
Bob’s two queries will further consume a budget of 10 each. Because the two 
groups are disjoint, Bob knows that any user with an initial budget of 60 or 
higher has enough remaining budget for his queries. (If the two queries were not 
known to have disjoint user populations, then Bob would need to filter for initial 
budgets of 70 or higher.) 

Bob generates the following two queries: 


— “count WHERE smoker=1 AND disease=lungCancer AND initBudget>=60”,
— “count WHERE smoker=0 AND disease=lungCancer AND initBudget>=60”.


In doing so, Bob is assured that no users are excluded from either query, and
avoids unknown dataset bias.²
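
The threshold arithmetic in this example can be spelled out in a few lines of Python. The helper `required_initial_budget` is a hypothetical name introduced only for illustration; the numbers (50 already consumed, 10 per query) are those of the running example.

```python
def required_initial_budget(consumed, planned_costs):
    """Smallest initial budget a point needs to survive all planned queries.

    consumed      -- budget already consumed at the point (public)
    planned_costs -- budgets of the planned queries that may cover the point
    """
    return consumed + sum(planned_costs)


# Smokers with lung cancer: 50 consumed so far, each of Bob's queries costs 10.
smokers_disjoint = required_initial_budget(50, [10])         # one query per point
smokers_overlapping = required_initial_budget(50, [10, 10])  # both may cover a point
# Non-smokers with lung cancer: nothing consumed so far.
non_smokers = required_initial_budget(0, [10])

print(smokers_disjoint, smokers_overlapping, non_smokers)
```

In Bob's two queries the larger of the two per-group thresholds, 60, is applied to both groups, so that smokers and non-smokers are selected by the same budget criterion.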

So far, we have described how Bob may query only points with sufficient 
remaining budget. However, when this is not the case, UniTraX is able to simply 
reject Bob’s queries. In fact, UniTraX can even inform him about which points 
are out of budget without leaking private information. Privacy is protected by 
the fact that Bob does not know whether these points exist in the dataset or 
not. UniTraX’s rejection does not reveal this information to Bob as it solely 
depends on public consumed budgets and query parameters. Using the returned 
information, Bob is able to debug his analysis and retry. 

UniTraX not only allows analysts to debug their analyses but is fully compatible
with existing DP systems. Any analysis that successfully executes over a
dataset protected by a global budget system requires only a simple initialization
to run on the same dataset protected by UniTraX (see Sect. 5 for PINQ-based
analyses). Thus, analysts can easily adapt to UniTraX and exploit the increased
utility of per-user budgets.


2 Note that if users select their own initial budgets, and there is some correlation
between user attributes and initial budgets, then there may still be a specific bias in
the data. For instance, if smokers tend to choose high budgets and non-smokers tend
to choose low budgets, then Bob’s queries would be biased towards smokers. This
problem appears fundamental to any system that allows individual user budgets.


3 Design Overview 


Threat Model. UniTraX uses the standard threat model for DP. The goal is 
to prevent malicious analysts from discovering whether a specific record (user) 
exists in the queried database (dataset). We assume, as usual, that analysts are 
limited to the interface offered by UniTraX and that they do not have direct 
access to the database. We make no assumptions about the background or aux- 
iliary knowledge of the analysts. Analysts may collude with each other offline. 


Goals. We designed UniTraX with the following goals in mind. 


Privacy: Users should be able to set privacy preferences (budgets) for their 
records individually. These preferences must be respected across queries. 


Utility: Querying a parameter subspace should not affect the usability of records 
in a disjoint subspace. 


Bias Discovery: The system should allow the analyst to discover when there 
may be a bias in query answers because privacy budgets of some parts of the 
parameter space have been depleted by past queries. 


Efficiency: The overhead of the system should be moderate. 


In the following we describe the design of UniTraX, explaining how it attains 
the first three goals above. The fourth goal, efficiency, is justified by the experi- 
mental evaluation in Sect. 6. 


Design Overview. For simplicity, we assume that the entire database is organized 
as a single table with a fixed schema. The schema includes a designated column 
for the initial privacy budget of each record. UniTraX is agnostic to how this 
initial budget is chosen—it may be a default value common to all records or 
it may be determined individually for each record by the person who owns the 
record. Higher values of initial budget indicate fewer privacy concerns for that
record. Records may be added to the database or removed from it at any time.
The set of all possible records constitutes the parameter space.³ We use the
term point for any point in the parameter space; a point may or may not exist 
in the actual database under consideration. We use the terms actual record and 
record for the points that actually exist in the database under consideration. 

3 The parameter space is also sometimes called the “domain” of the database.

Like most DP systems, UniTraX supports statistical or aggregate queries.
The query model is similar to that of PINQ [17]. An analyst performs a query
in two steps. First, the analyst selects a subspace of the parameter space using
a SQL SELECT-like syntax. For example, the analyst may select the subspace
“age>10 AND age<20 AND gender=male AND smoker=1”. Next, the analyst
runs an aggregate query like count, sum or average on this selected subspace.
To protect privacy, UniTraX adds random noise to the result of the query. The
amount of noise added is determined by a privacy parameter, ε, that the analyst
provides with the query. For lower values of ε, the result is noisier, but less
privacy budget is consumed (thus leaving more budget for future queries).
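
The text above does not name the noise distribution. As a hedged illustration, the sketch below assumes the Laplace mechanism that is standard for counting queries (sensitivity 1), where the noise scale is 1/ε. The function names are hypothetical and this is not the PINQ or UniTraX implementation.

```python
import random


def laplace_scale(epsilon, sensitivity=1.0):
    """Scale b of Laplace noise for a query with the given sensitivity."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def noisy_count(records, epsilon, rng=random):
    """Count the records and add Laplace(0, 1/epsilon) noise (assumed mechanism)."""
    b = laplace_scale(epsilon)
    # The difference of two i.i.d. exponentials with mean b is Laplace(0, b).
    noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
    return len(records) + noise
```

Under this assumption, halving ε doubles the noise scale, which matches the trade-off described above: cheaper queries are noisier.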

The novel aspect of UniTraX is how it tracks budgets. When an aggregate
query with privacy parameter ε is made on a selected subspace S, UniTraX
simply records that budget ε has been consumed from subspace S. The remaining
budget of any point in the parameter space is the point’s initial budget (from
the point’s designated initial budget field) minus the sum of the ε values of all
past queries that ran on subspaces containing the point.
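
The bookkeeping described here can be sketched in Python as a public log of (subspace, ε) pairs. This is a simplified illustration, not the data structure of the actual implementation: the class `ConsumptionLog`, the representation of subspaces as Python predicates, and the field name `init_budget` are all assumptions.

```python
class ConsumptionLog:
    """Public record of (subspace predicate, epsilon) pairs for past queries."""

    def __init__(self):
        self._entries = []

    def record(self, subspace, epsilon):
        self._entries.append((subspace, epsilon))

    def consumed(self, point):
        """Sum of the epsilons of all past queries whose subspace contains point."""
        return sum(eps for subspace, eps in self._entries if subspace(point))

    def remaining(self, point):
        """Initial budget (a field of the point) minus the consumed budget."""
        return point["init_budget"] - self.consumed(point)


log = ConsumptionLog()
log.record(lambda r: 10 <= r["age"] <= 20, 0.5)
log.record(lambda r: r["smoker"] == 1, 0.25)

young_smoker = {"age": 15, "smoker": 1, "init_budget": 2.0}
older_non_smoker = {"age": 40, "smoker": 0, "init_budget": 2.0}
```

Note that `remaining` is defined for any point of the parameter space, whether or not a matching record exists in the database, which is why the log can be made public.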

The consumed budgets of all subspaces are public—analysts can ask for them 
at any time. This allows analysts to determine which subspaces have been heavily 
queried in the past and, hence, become aware of possible data biases. Moreover, 
analysts may select only subspaces with sufficient remaining budgets in subse- 
quent queries, thus increasing their confidence in analysis outcomes, as illustrated 
in Sect. 2. 

To respect privacy budgets, it is imperative that a query with privacy parameter
ε does not execute on any points whose remaining budget is less than ε.
This is enforced by query rejection, where a query is executed only if all points
in the selected subspace have remaining budget at least ε. Note that this check
is made on not only actual records but all points in the selected subspace. If 
any such point does not have sufficient remaining budget, the query is rejected 
and an error is returned to the analyst (who may then select a smaller subspace 
with higher initial budgets and retry the query). Whether a query is executed or 
rejected depends only on the consumption history, which is public, so rejecting 
the query provides no additional information to the analyst. 


Initial Budgets. UniTraX is agnostic to the method used to determine initial 
budgets of actual records and supports any scheme for setting initial budgets on 
actual records. The simplest scheme would assign the same, fixed initial budget 
to every actual record. A more complex scheme may allow users to choose from 
a small fixed set of initial budgets for each record they provide, while the most 
complex scheme may let users freely choose any initial budget for every record. 


4 Formal Description and Differential Privacy 


In this section, we describe UniTraX using a formal model. We specify the dif- 
ferential privacy property that we expect UniTraX to satisfy and formally prove 
that the property is indeed satisfied. Our formalization is directly based on 
ProPer’s formalization [10], which we find both elegant and natural. 


4.1 Formal Model of UniTraX 


Database. We treat the database as a table with n columns of arbitrary types
$C_1, \ldots, C_n$ and an initial budget column that holds a non-negative real
number. The type of each record, also called the parameter space, is
$R = C_1 \times \cdots \times C_n \times C_B$, where $C_B = \mathbb{R}^{\geq 0}$ is the
type of the initial budget column. At any point of time, the state of the database
is a set E of records from the parameter space ($E \in 2^R$).


UniTraX. UniTraX acts as a reference monitor between the database and the 
analyst. Its internal state consists of two components: (1) the consumption history
H and (2) the select table T. 


1. UniTraX tracks the budget consumed by past queries on every subspace of 
the parameter space. Formally, this is equivalent to storing a map from points 
in the parameter space to non-negative real numbers. We call this map the 
consumption history, denoted H. H has the type $\mathcal{H} = R \to \mathbb{R}^{\geq 0}$. Intuitively,
H(r) is the amount of budget consumed by past queries that ran on subspaces 
containing the point r of the parameter space. 

2. To run an aggregate query in UniTraX, the analyst must first select a sub- 
space of the parameter space. To support selection of records that have at 
least a stipulated remaining budget, UniTraX allows selected subspaces to also 
span the consumption history. Consequently, a selected subspace is a subset 
of $R \times \mathbb{R}^{\geq 0}$ (points extended with their consumed budgets). We represent such
subspaces via logical predicates sspace of type
$\mathcal{P} = R \times \mathbb{R}^{\geq 0} \to \{\mathrm{true}, \mathrm{false}\}$.
For the analyst’s convenience, UniTraX allows storing a list of selected subspaces,
indexed by subspace variables drawn from a set SVar. UniTraX stores
the association between subspace variables and subspaces in a select table, T,
of type $SVar \to \mathcal{P}$.


Analyst. We model an adaptive analyst, who queries UniTraX based on an inter- 
nal program and previously received answers. Formally, the analyst is a (possibly 
infinite) state machine with states denoted by P and its decorated variants, and 
state transitions defined by the relation $P \xrightarrow{a} P'$. Here $a, b$ denote interactions
between the analyst and UniTraX. Allowed interactions are summarized in Fig. 2. 
Note that interactions consist of either an instruction to, or an observable output 
from UniTraX, or both. In detail, the interactions are: 


— sv := sspace represents the instruction to UniTraX to associate the subspace
variable sv with the subspace sspace, which must be of type $\mathcal{P}$. This models the
selection of a subspace (for use in later aggregation queries).

— $Q_\varepsilon(sv)?n$ models the instruction to UniTraX to run the aggregation query Q
with privacy parameter ε on the subspace previously mapped to variable sv.
The interaction also includes the noised result n of the query. If some point
in subspace sv has remaining budget less than ε, the output n is ‘reject’.

— update represents an output from UniTraX to the analyst indicating that the 
database has been updated. The output does not specify which records were 
added or deleted (else the analyst could trivially break DP). 

— read?H models reading the entire current consumption history by the analyst.
H is the history returned by UniTraX.


a, b ::= sv := sspace             select subspace sspace and name it sv
       | $Q_\varepsilon(sv)?n$    run aggregation query Q on sv, observe output n
       | update                   database update
       | read?H                   read consumption history, result is H


Fig. 2. Allowed interactions between analyst and UniTraX 


We make no assumptions about the analyst (i.e., its state machine). 
It may select any subspace, run any aggregation query, and read the consump- 
tion history at any time. However, for technical reasons we assume (like ProPer) 
that the analyst is internally deterministic and deadlock-free, meaning that it 
branches only on observable output from the database and that it can always 
make progress. Our assumptions are formalized by the following condition: 
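If $P \xrightarrow{a} P'$ and $P \xrightarrow{b} P''$, then


1. if $a = b$ then $P' = P''$

2. if $a = (sv := sspace)$ then $a = b$

3. if $a = Q_\varepsilon(sv)?n$ then $b = Q_\varepsilon(sv)?n'$ for some $n'$, and for all $n''$ there
exists $P'''$ with $P \xrightarrow{Q_\varepsilon(sv)?n''} P'''$

4. if $a = \mathit{read}?H$ then $b = \mathit{read}?H'$ for some $H'$, and for all $H''$ there
exists $P'''$ with $P \xrightarrow{\mathit{read}?H''} P'''$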


Configuration. A configuration C = (P, E, H,T) represents the state of the 
complete system. It includes the state of the analyst (P), the database of actual 
records (E) and the internal state of UniTraX (consumption history H and select 
table T). 


Execution Semantics. We model the evolution of the system using transitions
$C \xrightarrow{\alpha}_{p} C'$. Here, $\alpha \in Act$ denotes an action label describing an operation within
the system and p is a transition probability (real number between 0 and 1). The
transition $C \xrightarrow{\alpha}_{p} C'$ reads as follows: If, in configuration C, the operation $\alpha$
happens, then, with probability p, the configuration changes to C’. $\alpha$ may be
any one of:


— $\tau$: analyst selects a subspace

— $n \in Val$: query by analyst that returns result n

— reject: query by analyst that is rejected

— $R_{in} : R_{del}$: database update that adds records $R_{in}$ and removes records $R_{del}$

— H: analyst reads consumption history H


The transition system $C \xrightarrow{\alpha}_{p} C'$ is defined by the five rules shown in Fig. 3.
These rules model the system’s behavior as follows. 


(UPDATE) Models a database update by adding some record set $R_{in}$ and removing
some record set $R_{del}$ from the database E. This transition returns to the
analyst the observable output ‘update’ (first premise).


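4 These restrictions do not affect the analyst’s attack capabilities.


\[
\text{(UPDATE)}\quad
\frac{P \xrightarrow{\mathit{update}} P' \qquad R_{in}, R_{del} \subseteq R}
     {(P, E, H, T) \xrightarrow{R_{in} : R_{del}}_{1} (P', (E \cup R_{in}) \setminus R_{del}, H, T)}
\]

\[
\text{(SELECT)}\quad
\frac{P \xrightarrow{sv := sspace} P' \qquad sspace \in \mathcal{P}}
     {(P, E, H, T) \xrightarrow{\tau}_{1} (P', E, H, T[sv := sspace])}
\qquad
\text{(READ-HISTORY)}\quad
\frac{P \xrightarrow{\mathit{read}?H} P'}
     {(P, E, H, T) \xrightarrow{H}_{1} (P', E, H, T)}
\]

\[
\text{(QUERY)}\quad
\frac{\begin{array}{c}
P \xrightarrow{Q_\varepsilon(sv)?n} P' \qquad sspace := T(sv) \in \mathcal{P} \\
\forall r \in R.\; sspace(r, H(r)) \Rightarrow H(r) + \varepsilon \leq r.c_B \qquad
p = \mathrm{Prob}[Q_\varepsilon(E|_{sspace,H}) = n]
\end{array}}
{(P, E, H, T) \xrightarrow{n}_{p} (P', E, H', T)}
\]

where

\[
H'(r) := \begin{cases} H(r) + \varepsilon & \text{if } sspace(r, H(r)) \\ H(r) & \text{otherwise} \end{cases}
\qquad\text{and}\qquad
E|_{sspace,H} \triangleq \{ r \in E \mid sspace(r, H(r)) \}
\]

\[
\text{(REJECT)}\quad
\frac{\begin{array}{c}
P \xrightarrow{Q_\varepsilon(sv)?\mathit{reject}} P' \qquad sspace := T(sv) \in \mathcal{P} \\
\exists r \in R.\; sspace(r, H(r)) \wedge H(r) + \varepsilon > r.c_B
\end{array}}
{(P, E, H, T) \xrightarrow{\mathit{reject}}_{1} (P', E, H, T)}
\]


Fig. 3. Semantics of UniTraX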


(SELECT) Represents the analyst’s selection of subspace sspace, naming it sv. 


(READ-HISTORY) Denotes the analyst reading the current consumption history 
H. This rule forces our privacy proofs to internally show that the consumption 
history is indeed public. 

(QUERY) Models the successful execution of aggregation query $Q$ on subspace
sspace identified by sv with privacy parameter $\varepsilon$. The execution requires all
points in sspace to have a remaining budget of at least $\varepsilon$. A point $r$ is in sspace
if $sspace(r, H(r)) = true$. (In the rule, $r.c_B$ is shorthand for the initial budget
column of point $r$.) As a consequence of the query, two things happen. First, the
consumption history of all points in the subspace is increased by $\varepsilon$, to record that
a query with privacy parameter $\varepsilon$ has run on the subspace. Second, the answer to
query $Q$ executed over those records that are both in the subspace and actually
exist in the database $E$ (selected by the operation $E|_{sspace,H}$) is returned to
the analyst after adding differentially private noise for the parameter $\varepsilon$. The
transition’s probability $p$ is equal to the probability of getting the specific noised
answer for the query (the noised answer is denoted $n$ in the rule).

(REJECT) Represents UniTraX’s rejection of query Q due to some point in the 
query’s selected subspace not having sufficient remaining budget. The analyst 
observes a special response ‘reject’ (first premise). 
With the notable exception of (QUERY), all rules are deterministic: they
happen with probability 1 (the $p$ in $\xrightarrow{a}_p$ is 1).
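
The (QUERY) and (REJECT) rules can be sketched over a small finite parameter space. This is a minimal illustration under stated assumptions, not the UniTraX implementation: the `Point` type, the `run_query` helper, and the choice of a Laplace-noised count as the aggregation query are all hypothetical and introduced only for this example.

```python
import random
from typing import Callable, Dict, List, NamedTuple, Optional


class Point(NamedTuple):
    """A point of the parameter space; `budget` plays the role of r.c_B."""
    age: int
    budget: float


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def run_query(
    space: List[Point],
    records: List[Point],
    history: Dict[Point, float],
    sspace: Callable[[Point, float], bool],
    eps: float,
    rng: random.Random,
) -> Optional[float]:
    """Apply (QUERY) if every selected point can afford eps, else (REJECT).

    Returns the noised count, or None for the 'reject' observation.
    `history` is mutated only when the query succeeds.
    """
    selected = {r for r in space if sspace(r, history[r])}
    # (REJECT) premise: some selected point has H(r) + eps > r.c_B.
    if any(history[r] + eps > r.budget for r in selected):
        return None
    # E|_{sspace,H}: records that exist in E and lie in the selected subspace.
    matching = [r for r in records if r in selected]
    # Charge every selected point, whether or not a record exists there.
    for r in selected:
        history[r] += eps
    return len(matching) + laplace_noise(1.0 / eps, rng)
```

Note that budget is charged on points of the parameter space rather than on records, so the updated history does not depend on which records actually exist in the database.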


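Trace Semantics. The relation $C \xrightarrow{a}_p C'$ describes a single step of system
evolution. We lift this definition to multiple steps. A trace $\sigma$ is a (possibly empty)
finite sequence of labels $a_1, \ldots, a_n$. We write $C \xRightarrow{\sigma}_q C'$ to signify that
configuration $C$ evolves in multiple steps to configuration $C'$ with probability $q$. The
individual steps of the evolution have labels in $\sigma$. Formally, we have:

$$\frac{}{C \xRightarrow{[\,]}_{1} C} \qquad \frac{C \xRightarrow{\sigma}_{q} C' \qquad C' \xrightarrow{a}_{p} C''}{C \xRightarrow{\sigma :: a}_{p \cdot q} C''}$$

We abbreviate $C \xRightarrow{\sigma}_q C'$ to $C \xRightarrow{\sigma}_q$ when $C'$ is irrelevant.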

Note that from the transition semantics (Fig. 3) it follows that a trace $\sigma$
records all updates to the database and all observations of the analyst (the
latter comprise all responses from UniTraX to the analyst).


Extension to Silent Record Dropping. Up to this point, our design rejects a query 
whose selected subspace includes at least one point with insufficient remaining 
budget. This protects user privacy and prevents unknown dataset bias. However, 
in some cases, an analyst might prefer the risk of unknown dataset bias over 
modifying their existing programs to handle query rejections. This might be 
the case, for instance, if the analyst already knows by other means that the 
percentage of records with insufficient budget will be negligible. In this case, it 
would be preferable to automatically drop records with insufficient budget during 
query execution, as in ProPer. It turns out that we can provide silent record 
dropping without weakening the privacy guarantee. In the following paragraph, 
we detail a simple extension of UniTraX that allows the analyst to specify for 
each query individually whether the system should silently drop records with 
insufficient remaining budgets instead of rejecting the query. 

In order to enable silent record dropping, we introduce an extended
query interaction $Q^{drop}_\varepsilon(sv)?n$ for the analyst’s program. Unlike the previously
described interaction, $Q_\varepsilon(sv)?n$, this interaction cannot fail (be rejected). The
semantics of $Q^{drop}_\varepsilon(sv)?n$ is defined by the new rule (QUERY-DROP) shown in
Fig. 4. The query executes on those records in database $E$ that (1) are in
subspace sspace, and (2) have remaining budget at least $\varepsilon$. These records are selected
by $E\|_{sspace,H,\varepsilon}$. As a consequence of the query, two things happen. First, the
consumption history of all points in the parameter space satisfying (1) and (2) is
increased by $\varepsilon$. Second, the answer of the query is returned to the analyst with
probability $p$, which is determined by the same method used in (QUERY).
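
The selection performed by silent record dropping can be sketched as follows. This is an illustrative model only: the two-column `Point` tuple and the `run_query_drop` helper are assumptions made for the example, and noise is deliberately left out so that the selection and charging logic stay visible.

```python
from typing import Callable, Dict, List, Set, Tuple

Point = Tuple[int, float]  # (age, initial budget c_B); a hypothetical two-column space


def run_query_drop(
    space: List[Point],
    records: List[Point],
    history: Dict[Point, float],
    sspace: Callable[[Point, float], bool],
    eps: float,
) -> List[Point]:
    """Charge eligible points and return the records the query would run on.

    Noise is omitted here; a real query would aggregate the returned records
    and add eps-calibrated noise, as in (QUERY).
    """
    # Conditions (1) and (2): in the subspace, and H(r) + eps <= r.c_B.
    eligible: Set[Point] = {
        r for r in space if sspace(r, history[r]) and history[r] + eps <= r[1]
    }
    for r in eligible:
        history[r] += eps
    # Points without enough budget are skipped silently and are not charged.
    return [r for r in records if r in eligible]
```

Unlike the rejecting variant, this interaction always produces an answer; the price is that the analyst cannot tell from the answer alone how many records were dropped.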


4.2 Privacy Property and Its Formalization 


UniTraX respects the initial privacy budget of every record added to the database 
in the sense of differential privacy. Before explaining this property formally, we 
recap the standard notion of differential privacy due to Dwork et al. [6]. 
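QUERY-DROP
$$\frac{P \xrightarrow{Q^{drop}_\varepsilon(sv)?n} P' \qquad sspace := T(sv) \in \mathcal{P} \qquad p = \mathrm{Prob}[Q_\varepsilon(E\|_{sspace,H,\varepsilon}) = n]}{(P, E, H, T) \xrightarrow{n}_{p} (P', E, H', T)}$$

where
$$H'(r) := \begin{cases} H(r) + \varepsilon & \text{if } sspace(r, H(r)) \wedge H(r) + \varepsilon \le r.c_B \\ H(r) & \text{otherwise} \end{cases}$$
and
$$E\|_{sspace,H,\varepsilon} \stackrel{def}{=} \{ r \in E \mid sspace(r, H(r)) \wedge H(r) + \varepsilon \le r.c_B \}$$

Fig. 4. Semantics extension for silent record dropping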


Standard Differential Privacy. Let $Q$ be a randomized algorithm on a database
that produces a value in the set $V$. For example, the algorithm may compute
a noisy count of the number of entries in the database. We say that $Q$ is
$\varepsilon$-differentially private if for any two databases $D, D'$ that differ in one record and
for any $V' \subseteq V$,

$$\left| \ln \left( \frac{\Pr[Q(D) \in V']}{\Pr[Q(D') \in V']} \right) \right| \le \varepsilon.$$

In words, the definition says that for two databases that differ in only one record, 
the probabilities that the analyst running Q makes a specific observation are very 
similar. This means that any individual record does not significantly affect the 
probability of observing any particular outcome. Hence, the analyst cannot infer 
(with high confidence) whether any specific record exists in the database. 
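
As a concrete instance, the standard Laplace mechanism for a count (sensitivity 1, noise scale $1/\varepsilon$) satisfies this bound. The sketch below checks the bound on output densities for two neighboring counts; the helper names `laplace_pdf` and `log_ratio` are hypothetical, and the Laplace mechanism is used here as a textbook example rather than as a statement about the noise the system actually adds.

```python
import math


def laplace_pdf(x: float, mean: float, scale: float) -> float:
    """Density of the Laplace distribution with the given mean and scale."""
    return math.exp(-abs(x - mean) / scale) / (2.0 * scale)


def log_ratio(output: float, count_d: float, count_d_prime: float, eps: float) -> float:
    """ln of the density ratio of observing `output` on D versus on D'."""
    scale = 1.0 / eps  # a count has sensitivity 1, so the scale is 1 / eps
    return math.log(
        laplace_pdf(output, count_d, scale) / laplace_pdf(output, count_d_prime, scale)
    )
```

Because the density ratio is bounded by $e^{\varepsilon}$ at every output, integrating over any set $V'$ gives the same bound on the probabilities in the definition.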

If the analyst runs $n$ queries that are $\varepsilon_1$-, ..., $\varepsilon_n$-differentially private, then
the total loss of privacy is defined as $\varepsilon_1 + \ldots + \varepsilon_n$. Typically, a maximum privacy
budget is set when the analyst is given access to the database and after each
$\varepsilon$-differentially private query, $\varepsilon$ is subtracted from this budget. Once the budget
becomes zero, no further queries are allowed. In this mode of use, DP guarantees
that for any two possible databases $D, D'$ that differ in at most one record, for
any sequence of queries $\vec{Q}$, and for any sequence of observations $\vec{o}$,

$$\left| \ln \frac{\Pr[\vec{Q} \text{ results in } \vec{o} \text{ on } D]}{\Pr[\vec{Q} \text{ results in } \vec{o} \text{ on } D']} \right| \le \eta,$$

where $\eta$ is the privacy budget.


Our Privacy Property. We use the same privacy property as ProPer. This
privacy property generalizes differential privacy described above by accounting for
dynamic addition and deletion of records and, importantly, allowing all new
records to carry their own initial budgets. Informally, our privacy property is
the following. Consider two possible traces $\sigma_0$ and $\sigma_1$ that can result from the
same starting configuration. Suppose that $\sigma_0$ and $\sigma_1$ differ only in the updates
made to the database and are otherwise identical. Let $p_0$ and $p_1$ be the respective
probabilities of the traces. Then, $\left|\ln\left(\frac{p_0}{p_1}\right)\right| \le \eta$, where $\eta$ is the sum of the initial
budgets of all records in which the database updates differ between $\sigma_0$ and $\sigma_1$.
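$$dist(\sigma, \sigma') \stackrel{def}{=} \begin{cases} \bigcup_{i=1}^{n} dist(a_i, a_i') & \text{if } \sigma = a_1, \ldots, a_n \text{ and } \sigma' = a_1', \ldots, a_n' \\ (R_{in} \,\triangle\, R_{in}') \cup (R_{del} \,\triangle\, R_{del}') & \text{if } \sigma = R_{in} : R_{del} \text{ and } \sigma' = R_{in}' : R_{del}' \\ \emptyset & \text{if } \sigma = \sigma' \end{cases}$$

Fig. 5. Trace distance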


Why is this a meaningful privacy property? We remarked earlier that a trace 
records all observations that the analyst (adversary) makes. Consequently, by 
insisting that the traces agree everywhere except on database updates, we are 
saying that the two traces agree on the analyst’s observations. Hence, if an 
analyst makes a sequence of observations under database updates from $\sigma_0$ with
probability $p_0$, then the probability that the analyst makes the same observations
under database updates from $\sigma_1$ is very close to $p_0$. In fact, the log of the ratio of
the two probabilities is bounded by the sum of the initial budgets of the records 
in which the updates differ. This is a natural generalization of DP’s per-database 
budgets to per-record budgets. 

To formalize this property, we define a partial function $dist(\sigma, \sigma')$ that returns
the set of records in which database updates in $\sigma$ and $\sigma'$ differ if $\sigma$ and $\sigma'$ agree
pointwise on all labels other than database updates. If $\sigma$ and $\sigma'$ differ at a label
other than a database update, then $dist(\sigma, \sigma')$ is undefined. The formal definition
is shown in Fig. 5.
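
The trace distance can be sketched as a function that returns `None` where the partial function is undefined. The `Update` type and the encoding of non-update labels as plain Python values are assumptions made for this example; treating traces of different lengths as undefined is also an assumption, since the definition only covers traces of equal length.

```python
from typing import FrozenSet, NamedTuple, Optional, Sequence, Set


class Update(NamedTuple):
    """A database-update label R_in : R_del."""
    r_in: FrozenSet[str]
    r_del: FrozenSet[str]


def dist(trace_a: Sequence[object], trace_b: Sequence[object]) -> Optional[Set[str]]:
    """Records in which the updates differ, or None if the distance is undefined."""
    if len(trace_a) != len(trace_b):
        return None
    differing: Set[str] = set()
    for a, b in zip(trace_a, trace_b):
        if isinstance(a, Update) and isinstance(b, Update):
            # Symmetric differences of the added and of the removed record sets.
            differing |= (a.r_in ^ b.r_in) | (a.r_del ^ b.r_del)
        elif a != b:
            # The traces disagree on an analyst observation: undefined.
            return None
    return differing
```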


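Definition 1 (Privacy). We say that UniTraX preserves privacy if whenever
$C \xRightarrow{\sigma_0}_{p_0}$ and $C \xRightarrow{\sigma_1}_{p_1}$ and $dist(\sigma_0, \sigma_1) = R$, then $\left|\ln\left(\frac{p_0}{p_1}\right)\right| \le \sum_{r \in R} r.c_B$.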


Our main result is that UniTraX is private in the sense of the above definition. 


Theorem 1 (Privacy of UniTraX). UniTraX preserves privacy in the sense 
of Definition 1. 


We prove this theorem by first proving a strong invariant of configurations 
that takes into account how UniTraX tracks the consumption history. The entire 
proof is in our technical report [19]. 


5 Implementation 


We have implemented UniTraX on top of PINQ, an earlier framework for enforc- 
ing differential privacy with a global budget for the database [17]. We briefly 
review relevant details of PINQ before explaining our implementation. 


PINQ Review. PINQ adds differential privacy to LINQ, a general-purpose data- 
base query framework. LINQ defines Queryable objects, abstractions over data 
sources, e.g., a database table. The Queryable object may be transformed by a 
SQL SELECT-like operation to obtain another Queryable object representing 
selected records from the table. One may run an aggregate query on this second 
object to obtain a specific value. 
Building on LINQ, PINQ maintains a global privacy budget for the entire
database. This budget is set when a Queryable object is initialized. Subse- 
quently, differentially-private noise is added to every aggregation query on every 
object derived from this Queryable object and the global budget is appropriately 
reduced. 


UniTraX Implementation. Our implementation currently supports only query 
execution with rejection. The main addition to PINQ is tracking of consump- 
tion budgets over subspaces. In principle, we must store the consumption budget 
for every point in the parameter space. In practice, queries tend to select contigu- 
ous ranges, so at any point of time, the parameter space splits into contiguous 
subspaces, each with a uniform consumption budget. Accordingly, our imple- 
mentation tries to cluster contiguous subspaces with identical consumption and 
represents them efficiently. 

Our interface defines a new object type, UQueryable, which represents a 
subspace. Like Queryable, this object can be transformed via SQL SELECT-like 
operations to derive other, smaller UQueryable objects. To run an analysis on 
a subspace, the analyst invokes a special function, GetAsPINQ, to convert a 
UQueryable object representing the subspace into a PINQ object representing 
the same subspace. This special function also takes as an argument a budget, 
which the analysis will eventually consume. The function first checks that this
budget is no larger than the remaining budget of every point in the subspace. If not,
the function fails. Otherwise, this budget is immediately added to the
consumption budget of the subspace and a fresh PINQ object initialized with this budget
is returned. Subsequently, the analyst can run any queries on the PINQ object 
and PINQ’s existing framework enforces the allocated budget. 
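
The clustering of contiguous subspaces and the check-then-charge step can be sketched for a single integer-valued column. This is a deliberately simplified model, not the actual data structure: `ConsumptionTracker` and `try_charge` are hypothetical names, and the sketch assumes one uniform initial budget for all points, whereas UniTraX keeps the initial budget as a column of the parameter space.

```python
from typing import List, Tuple

Segment = Tuple[int, int, float]  # half-open range [start, end) and its consumed budget


class ConsumptionTracker:
    """Tracks consumed budget over one column as merged contiguous segments."""

    def __init__(self, lo: int, hi: int, initial_budget: float) -> None:
        self.initial_budget = initial_budget  # simplification: same for every point
        self.segments: List[Segment] = [(lo, hi, 0.0)]

    def max_consumed(self, lo: int, hi: int) -> float:
        # Assumes [lo, hi) is non-empty and lies inside the tracked range.
        return max(c for s, e, c in self.segments if s < hi and lo < e)

    def try_charge(self, lo: int, hi: int, eps: float) -> bool:
        """Charge eps on [lo, hi) if every point can afford it; else change nothing."""
        if self.max_consumed(lo, hi) + eps > self.initial_budget:
            return False
        charged: List[Segment] = []
        for s, e, c in self.segments:
            if e <= lo or hi <= s:  # no overlap with the charged range
                charged.append((s, e, c))
                continue
            if s < lo:
                charged.append((s, lo, c))
            charged.append((max(s, lo), min(e, hi), c + eps))
            if hi < e:
                charged.append((hi, e, c))
        self.segments = self._merge(charged)
        return True

    @staticmethod
    def _merge(segments: List[Segment]) -> List[Segment]:
        """Fuse adjacent segments that carry identical consumption."""
        merged: List[Segment] = []
        for s, e, c in segments:
            if merged and merged[-1][1] == s and merged[-1][2] == c:
                merged[-1] = (merged[-1][0], e, c)
            else:
                merged.append((s, e, c))
        return merged
```

With this representation, charging ten adjacent ranges by the same amount leaves two segments rather than eleven, which is the effect the clustering aims for.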

We also provide a new interface to the analyst to ask for the maximum budget 
consumed in a given subspace. 


Typical Analysis Workflow. We briefly describe the steps an analyst must follow 
to run an analysis on our implementation. Assume that the analyst wants to 
analyze records within a specific subspace with a set of queries that require a 
certain amount of budget to run successfully. Further assume that the analysis 
needs to run on a stipulated minimum number of user records for its results to 
be meaningful. The analyst would perform the following steps: 


1. Obtain the initial UQueryable object representing the entire database.

2. Select the desired subspace, obtaining another UQueryable object.

3. Obtain the maximum budget consumed on the second object.

4. Add the budget required for the analysis and a budget for a noisy count to
the just-obtained maximum budget.

5. Select the subspace that has at least the just-calculated sum of budgets
available, obtaining yet another UQueryable object.

6. Obtain a PINQ object from the last UQueryable object with the PINQ budget
set to the budget of the count.

7. Perform a (noisy) count on the PINQ object. If it is too low, stop here.

8. Otherwise, obtain another PINQ object, this time with the budget required
for the analysis.

9. Perform the analysis on the second PINQ object. All records in the PINQ
object have enough budget for the full analysis.
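
The steps above can be sketched against a small in-memory stand-in. Everything here is hypothetical and only mirrors the described workflow: the Python classes `UQueryable` and `Pinq`, their method names, the `analyze` helper, and the `(fare, budget)` point layout are assumptions for illustration and are not the C# interface of the implementation.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[int, float]  # (fare, initial budget c_B); hypothetical two-column space


class Pinq:
    """Stand-in for a PINQ object: enforces a fixed budget over a record set."""

    def __init__(self, records: List[Point], budget: float, rng: random.Random) -> None:
        self.records, self.budget, self.rng = records, budget, rng

    def noisy_count(self, eps: float) -> float:
        if eps > self.budget:
            raise ValueError("PINQ budget exhausted")
        self.budget -= eps
        # Difference of two exponentials with rate eps is Laplace noise of scale 1/eps.
        return len(self.records) + self.rng.expovariate(eps) - self.rng.expovariate(eps)


class UQueryable:
    """Stand-in for a subspace: a set of points sharing one consumption history."""

    def __init__(self, points: List[Point], records: List[Point],
                 history: Dict[Point, float], rng: random.Random) -> None:
        self.points, self.records, self.history, self.rng = points, records, history, rng

    def where(self, pred: Callable[[Point, float], bool]) -> "UQueryable":
        kept = [p for p in self.points if pred(p, self.history[p])]
        return UQueryable(kept, self.records, self.history, self.rng)

    def max_consumed(self) -> float:
        return max((self.history[p] for p in self.points), default=0.0)

    def get_as_pinq(self, eps: float) -> Optional[Pinq]:
        # Fail without charging if any point cannot afford eps.
        if any(self.history[p] + eps > p[1] for p in self.points):
            return None
        for p in self.points:
            self.history[p] += eps
        members = set(self.points)
        return Pinq([r for r in self.records if r in members], eps, self.rng)


def analyze(db: UQueryable, in_range: Callable[[Point], bool], eps_count: float,
            eps_analysis: float, min_records: float) -> Optional[Pinq]:
    sub = db.where(lambda p, h: in_range(p))                  # step 2
    needed = sub.max_consumed() + eps_analysis + eps_count    # steps 3 and 4
    rich = sub.where(lambda p, h: p[1] >= needed)             # step 5
    counter = rich.get_as_pinq(eps_count)                     # step 6
    if counter is None or counter.noisy_count(eps_count) < min_records:
        return None                                           # step 7: too few records
    return rich.get_as_pinq(eps_analysis)                     # step 8; step 9 uses the result
```

Selecting on the budget column in step 5 is what guarantees that the second PINQ object can be obtained: every remaining point has at least the budget for the count plus the analysis still available.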


Data Stream Analysis. UniTraX can be directly used for analysis on streams of 
data since its design and privacy proof already take record addition and deletion 
into account. To allow analysts to use the full budget of newly arriving records, 
we assume records to be timestamped on arrival; this timestamp is another 
column in our parameter space. At any time, all active analyses use points with 
timestamps in a specific window of time only. When the budgets of points in 
the window run out, the window is shifted to newer timestamps. Records with 
timestamps in the old window can be discarded. All analyses share the budgets 
of points in the active time window. 


6 Preliminary Evaluation 


This section presents a preliminary evaluation of the performance of our imple- 
mentation of UniTraX. It is preliminary in that (1) it uses only one dataset (the 
New York City taxi ride dataset [18,21]), and (2) we carry out only one “anal- 
ysis session”. The session consists of queries that perform the basic statistical 
operations of count, average, and median. 


Objective. Of primary interest to us is the increase in end-to-end latency expe- 
rienced by the analyst (time from query submission to answer reception) as 
compared to both PINQ (reference DP) and LINQ (baseline that provides no 
privacy). Additionally, we want to understand the overhead of storing UniTraX’s 
budget consumption history data structure. 

In absolute terms, these overheads are a function of the access pattern on 
the parameter space. The exact column names, the data in them or the precise 
queries do not matter for this. Nonetheless, we briefly describe the dataset we 
use and the queries we run. The queries are deliberately chosen to be simple 
since long-running, complex queries will mask UniTraX’s relative overheads. 


Dataset. We use all taxi rides of New York City reported for January 2013 
(~14M records). We modify these records to only contain numerical data and 
add an additional initial budget for each. For the purpose of our measurements 
all budgets are chosen high enough so that no budgets expire. 


Analysis Session. Our session is roughly patterned after the analysis of the
same dataset described in [12]. The session consists of 1213 queries split into
three groups. The first group covers the entire geographic area, and consists of
six histograms for different columns. The subsequent groups focus on a 16 × 16
grid of squares in Manhattan. The second group of queries counts the number of
rides in each square, and takes averages over two different columns for squares
that have more than 5000 rides with sufficient budget. The third group counts
rides again and takes the median of one column for squares that have more than
1000 rides with sufficient budget.


Experimental Setups. We run the session over each of the following three setups: 


1. Directly on LINQ using the LINQ-to-SQL interface (no privacy protection). 
2. Through a PINQ object (DP protection with a global budget). 
3. With UniTraX. 


All numbers presented in this section are averages of five runs of the session. 


Hardware. All experiments run on two identical commodity Dell PowerEdge 
M620 blade systems. Each is equipped with dual Intel Xeon E5-2667 v2 8- 
core CPUs with Hyperthreading (total of 32 hardware threads per machine) and 
256 GB of main memory. Both systems are connected to the same top-of-rack 
switch with two bonded 1 Gbit/s connections each. 


Software. We use Microsoft Windows Server 2016 on both systems. The first 
system runs both UniTraX as well as the client query program. Microsoft Visual 
Studio Community 2015 is the only additional software installed for these tasks. 
The second system runs Microsoft SQL Server 2016 Developer Edition as the 
remote database server. To optimize database performance we put data and 
index files of our database onto a RAM-disk, create indexes that fit our queries, 
and make the database read-only. 


Absolute and Relative Latency Overheads. Figure 6 presents absolute end-to-end 
latencies for the three experimental setups: direct, only PINQ, and UniTraX. A 
random 5% sample of the 1213 queries is shown, sorted on the x-axis by increas- 
ing latency with respect to the direct experiment. Overheads are moderate. As 
expected, UniTraX is usually slower than PINQ, which is slower than direct 
query execution without any privacy protection. In 3.2% of the cases, UniTraX 
outperforms direct and PINQ. We verified that in these cases the database server 
chose to do a sequential table scan for direct and PINQ but a parallel and thus 
faster index scan for UniTraX. We were unable to force parallel execution for 
direct and PINQ. 

Figure 7 presents a CDF for all 1213 queries in terms of the overhead of 
UniTraX relative to direct and PINQ respectively. We observe that in half of the 
cases, UniTraX is 1.5x slower than PINQ and 2x slower than the direct case. At 
the 99th-percentile UniTraX is 2.5x slower than PINQ and 3.5x slower than the 
direct case. The figure includes a tail between 0 and 1, indicating that UniTraX 
is sometimes faster than PINQ or the direct case. As explained before, this 
behavior is due to the database choosing sub-optimal query plans for PINQ and 
the direct case. On average, UniTraX is 1.3x slower than PINQ and 1.8x slower 
than the direct case. In summary, latency overheads introduced by UniTraX are 
moderate. 
Fig. 6. End-to-end latencies of a 5% sample of the 1213 queries ordered according to
latencies of direct. The trend in the order of performance is evident. UniTraX is slower 
than PINQ, which is slower than direct. Where UniTraX outperforms the others, the 
database chose a better query plan for UniTraX’s queries. 
Fig. 7. CDF of relative overheads incurred by UniTraX across all 1213 queries. At
the 99th-percentile UniTraX is 2.5x slower than PINQ and 3.5x slower than the direct 
case. The initial tail of inverse overhead before 1 consists of 3.2% of queries where the 
database chooses sub-optimal query plans for PINQ and the direct case. 


Size of Budget Tracking State. Figure 8 shows the number of subspaces tracked 
by UniTraX at the beginning of each query. Numbers are again ordered according 
to query latencies in the direct case (see Fig. 6). These numbers do not change
across different runs. The two curves represent two analyst query strategies, one 
with and one without re-balancing. These two curves illustrate that the analyst 
can dramatically affect the size of the budget tracking state based on how queries 
are formulated. 
Fig. 8. Number of subspaces UniTraX tracks throughout the execution of the queries
shown in Fig. 6. Reported numbers are obtained at the beginning of each query and do 
not change across different runs. The different curves represent two different analyst 
query strategies, one where the analyst only requests data of interest (w/o RB), and 
one where the analyst requests extra data in order to improve UniTraX’s re-balancing 
(w/ RB). This shows that analysts can substantially reduce the overhead of UniTraX 
through careful selection of query parameters. 
In the “without re-balancing” strategy (w/o RB), the analyst queries data
only within a range of interest. For instance, suppose that the analyst is
interested in a histogram of fares between $0 and $100. The analyst may request
ested in a histogram of fares between $0 and $100. The analyst may request 
ten $10 bars. As long as each bar consumes the same budget, UniTraX will 
optimize tracking state and merge the subspaces of these 10 bars into a sin- 
gle subspace. The range above the histogram (above $100), however, cannot be 
merged. As a result, UniTraX stores two subspaces for the fare column. The same 
happens with other columns, with the result that there is a combinatorial
explosion in the number of subspaces because of the combinations of the columns’
multiple subspaces.

In the “with re-balancing” strategy (w/ RB), the analyst instead queries 
data that covers the full range of a column, even though the analyst may not 
be interested in all of that range, or may even know that no data exists in some 
subrange (e.g., no taxi pickups over water). As a result, UniTraX is able to 
merge more subspaces, even those of different columns. At the cost of budget, 
this reduces the number of subspaces substantially, in this case by more than 
an order of magnitude. Re-balancing thus allows analysts to trade off overheads
against budget savings.
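
The combinatorial effect can be made concrete with a rough count. Assuming, as a simplification not stated above, that tracked subspaces are axis-aligned boxes obtained by combining the per-column segments, the number of tracked subspaces $N$ over $k$ columns with $s_i$ segments in column $i$ is bounded by

$$N \le \prod_{i=1}^{k} s_i .$$

For illustration only: with $k = 3$ columns split into $s_i = 2$ segments each (as for the fare column without re-balancing), up to $2^3 = 8$ subspaces must be tracked, whereas full-range queries that keep $s_i = 1$ leave a single subspace.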


7 Related Work 


Due to its age, the area of privacy-preserving data analytics has amassed a vast 
amount of work. The related work section of [16] provides a good overview of 
early work in this space. Around ten years ago Dwork et al. introduced differen- 
tial privacy or DP [6], which quickly developed into a standard for private data 
analytics research (see [7,8]). In this section, we focus on research that investi- 
gates heterogeneous or personalized budgets, tracking of personalized budgets, 
private analytics on dynamic data sets, and PINQ, the system our implementa- 
tion is based on. 

Alaggan et al. [1] propose heterogeneous differential privacy (HDP) to deal 
with user-specific privacy preferences. They allow users to provide a separate pri- 
vacy weight for each individual data field, a granularity finer than that supported 
by UniTraX. However, the total privacy budget is a global parameter. When 
computing a statistical result over the dataset, HDP perturbs each accessed 
data value individually according to its weight and the global privacy budget. 
UniTraX can be extended to support per field rather than per record budgets 
at the cost of additional runtime latency. Further, UniTraX allows analysts to 
query parts of a dataset without consuming the privacy budget of other parts. 
UniTraX also supports a greater set of analytic functions, e.g., median. HDP 
does not provide these capabilities. Queries can only run over the whole dataset 
and, as privacy weights are secret, the exact amount of answer perturbation 
remains unknown to the analyst. 

Jorgensen et al.’s personalized differential privacy (PDP) is a different app- 
roach to the same problem [15]. In contrast to UniTraX, PDP trusts analysts 
and assumes that per-user budgets are public. It tracks the budget globally but 
manages to avoid being limited to the most restrictive user’s budget by
allowing the analyst to sample the dataset prior to generating any statistical output.
Depending on the sampling parameters the analyst is able to use more than the 
smallest user budget for a query (but on a subset of records). PDP only supports 
querying the entire dataset at once. Nevertheless, we believe that a combination 
of PDP and UniTraX could be useful, in particular to allow analysts to make 
high budget queries on low budget records. The combination could also do away 
with PDP’s assumption that analysts be trusted. 

In place of personalized privacy protection, Nissim et al. [20] and earlier 
research projects [5,14] provide users different monetary compensation based 
on their individual privacy preferences. It is unclear whether these models can 
be combined with UniTraX as they do not provide any personalized privacy 
protection. Users with a higher valuation receive a higher compensation but 
suffer the same privacy loss as other users. 

Despite allowing users to specify individual privacy preferences, all the above 
systems track budget globally and do not allow analysts to selectively query 
records and consume budget only from the queried records. To the best of our 
knowledge, ProPer [10] is the only system that allows this. We compared exten- 
sively to ProPer in Sect. 2. Our formal model in Sect. 4 is also based on ProPer’s 
formal model. Google’s RAPPOR [11] likewise provides differential privacy guar- 
antees based on user-provided parameters, but the system model is significantly 
different from ours and the privacy guarantee holds only when certain cross- 
query correlations do not occur. In contrast, we (and ProPer) need no such 
assumptions. 

Differential privacy is being increasingly applied to dynamic datasets rather 
than static databases. Since the first consideration of such scenarios in 2010 [9], 
numerous systems have emerged [2–4,13,22,23] that aggregate dynamic data
streams rather than static datasets in a privacy-preserving manner. UniTraX and 
ProPer can be immediately used for dynamic data streams since their designs 
and privacy proofs already take record addition and deletion into account. 

As explained in Sect. 5, our UniTraX implementation is based on the
Privacy Integrated Queries (PINQ) [17] platform, which offers privacy-preserving
data analysis capabilities. PINQ, in turn, is based on the Language Integrated
Queries (LINQ) framework, a well-integrated declarative extension of the .NET
platform. LINQ provides a unified object-oriented data access and query
interface, allowing analysts data access independent of how the data is provided and
where the answer is finally computed. Data providers can be switched without 
changing code and can be, e.g., local files, remote SQL servers, or even mas- 
sive parallel cluster systems like DryadLINQ [24]. PINQ provides a thin DP 
wrapper over LINQ. For all queries, it ensures that sufficient budget is available 
and that returned answers are appropriately noised. The maximum budget must 
be provided during object initialization. Our implementation uses PINQ in an 
unconventional way—we initialize a new PINQ object prior to every data analy- 
sis, and use PINQ to enforce a stipulated budget. Additionally, we track budget 
consumption on subspaces of the parameter space across queries. 
8 Conclusion and Future Work


This paper presented UniTraX, the first differentially private system that sup- 
ports per-record privacy budgets, tells the analyst where (in the parameter space) 
budgets have been used in the past, and allows the analyst to query only those 
points that still have sufficient budget for the analyst’s task. UniTraX attains 
this by tracking budget consumption not on actual records in the database, 
but on points in the parameter space. As a result, information about budget 
consumption reveals nothing about actual records to the analyst. 

We have also presented a formal model of UniTraX and a formal proof that 
UniTraX respects differential privacy for all records. Our prototype implemen- 
tation incurs moderate overheads on a realistic workload. 

There are several directions for future work. First, our implementation is not 
very optimized and there is scope for reducing overheads even further. Second, 
UniTraX can be extended to track budgets at even finer granularity, e.g., a 
budget for every field. Third, one could investigate how queries can be optimized 
to reduce budget consumption. 
Abstract. Porting a policy from one firewall system to another is a
difficult and error-prone task. Indeed, network administrators have to know in
detail the policy meaning, as well as the internals of the firewall systems 
and of their languages. Equally difficult is policy maintenance and refac- 
toring, e.g., removing useless or redundant rules. In this paper, we present 
a transcompiling pipeline that automatically tackles both problems: it 
can be used to port a policy into an equivalent one, when the target fire- 
wall language is different from the source one; when the two languages 
coincide, transcompiling supports policy maintenance and refactoring. 
Our transcompiler and its correctness are based on a formal intermedi- 
ate firewall language that we endow with a formal semantics. 


1 Introduction 


Firewalls are one of the standard mechanisms for protecting computer networks.
Configuring and maintaining them is difficult even for expert system
administrators, since firewall policy languages are varied and usually rather complex:
they account for low-level system and network details and support non-trivial
control flow constructs. Additional difficulties come from the way in which
packets are processed by the network stack of the operating system, and further issues
are due to Network Address Translation (NAT), the mechanism for translating
addresses and performing port redirection while packets traverse the firewall.
A configuration is typically composed of a large number of rules and it is often
hard to figure out the overall firewall behavior. Also, firewall rules interact with
each other, e.g., some shadow others, making them redundant or preventing them
from being triggered. Often administrators resort to policy refactoring to solve these
issues and to obtain minimal and clean configurations. The Software Defined Network
(SDN) paradigm has recently been proposed for programming the network as a
whole at a high level, making network and firewall configuration simpler and less
error-prone. However, network administrators still have to face the porting of
firewall configurations from a variety of legacy devices into this new paradigm.

Both policy refactoring and porting are demanding operations because they 
require system administrators to have a deep knowledge about the policy mean- 
ing, as well as the internals of the firewall systems and of their languages. To 
automatically solve these problems we propose here a transcompiling pipeline 
composed of the following stages: 


1. decompile the policy in the source language into an intermediate language; 

2. extract the meaning of the policy as a set of non overlapping declarative rules 
describing the accepted packets and their translations in logical terms; 

3. compile the declarative rules into the target language. 


Another key contribution of this paper is to formalize this pipeline and to prove 
that it preserves the meaning of the original policy (Theorems 1, 2 and 3). The 
core of our proposal is the intermediate language IFCL (Sect. 4), which offers
all the typical features of firewall languages such as NAT, jumps, invocations to 
rulesets and stateful packet filtering. This language unveils the bipartite struc- 
ture common to real firewall languages: the rulesets determining the destiny of 
packets and the control flow in which the rules are applied. The relevant aspects 
of IFCL are its independence from specific firewall systems and their languages, 
and its formal semantics (Sect. 5). Remarkably, stage 1 provides real languages,
which usually have no formal semantics, with the one inherited from their
decompilation to IFCL. In this way the meaning of a policy is formally defined, thus
allowing algorithmic manipulations that yield the rules of stage 2 (Sect. 6). These
rules represent minimal configurations in a declarative way, covering all accepted 
packets and their transformations, with neither overlapping nor shadowing rules. 
These two stages are implemented in a tool appearing in a companion paper [1] 
and surveyed below, in the section on related work. The translation algorithm 
of stage 3 (Sect. 7) distributes the rules determined in the previous stage on the 
relevant points of the firewall where it decides the destiny of packets. 

To show our transcompilation at work, we consider iptables [2] and pf [3] 
(Sect. 2), since they have very different packet processing schemes making policy 
porting hard. In particular, we apply the stages of our pipeline to port a policy 
from iptables to pf (Sect.3). For brevity, we do not include an example of 
refactoring, which occurs when the source and the target languages coincide. 


Related Work. Formal methods have been used to model firewalls and access 
control, e.g., [4-6]. Below we restrict our attention to language-based approaches. 

Transcompilation is a well-established technique to address the problem of 
code refactoring, automatic parallelization and porting legacy code to a new 
programming language. Recently, this technique has been largely used in the 
field of web programming to implement high level languages into JavaScript, 
see e.g., [7,8]. We tackle transcompilation in the area of firewall languages to 
support porting and refactoring of policies. 

To the best of our knowledge, the literature has no approaches to mechani- 
cally porting firewall policies, while it has some to refactoring. The proposal in [9] 
is similar to ours, in that it “cleans” rulesets, then analyzes them by an automatic 
tool. It uses a formal semantics of iptables (without NAT) and a semantics- 
preserving ruleset simplification. The tool FIREMAN [10] detects inconsistencies 
and inefficiencies of firewall policies (without NAT). The Margrave policy 
analyzer [11] analyzes IOS firewalls, and is extensible to other languages. However, 
the analysis focuses on finding specific problems in policies rather than 
synthesizing a high-level policy specification. Another tool for discovering anomalies 
is Fang [12,13], which also synthesizes an abstract policy. Our approach differs 
from the above proposals mainly because at the same time it (i) is language- 
independent; (ii) defines a formal semantics of firewall behavior; (iii) gives a 
declarative, concise and neat representation of such a behavior; (iv) supports 
NAT; (v) generates policies in a target language. 

Among the papers that formalize the semantics of firewall languages, we 
mention [14,15] that specify abstract filtering policies to be then compiled into 
the actual firewall systems. More generally, NetKat [16] proposes linguistic con- 
structs for programming a network as a whole within the SDN paradigm. All 
these approaches propose their own high level language with a formal semantics, 
and then compile it to a specific target language (cf. our stage 3). Instead, IFCL 
intermediates between real source and target languages. It thus takes from real 
languages actions both for filtering/rewriting packets (notably NAT and MARK) 
and for controlling the inspection flow, widely used in practice. 

Our companion paper [1] describes the design of an automated tool and its 
application to real cases. The tool implements the first two stages of our pipeline 
and supports system administrators in the verification of some properties of a 
given firewall policy. In particular, the user can ask queries to check implication, 
equivalence and difference of policies, and reachability among hosts. The tool 
uses the same syntax of Sect.4 but only sketches how to obtain the declarative 
representation of a given policy, while here we fully formalize the process and 
prove it correct (Sect.6.2). In detail, the present paper partially overlaps with [1] 
on Sect. 4, where the language is presented, and on Sect.6.2, where the logical 
characterization is introduced. Besides the technical details and theorems, which 
support the semantics and the correctness of the whole approach missing in [1], 
here we also address the issue of compiling the declarative firewall representation 
to a target language, enabling transcompilation (cf. Sects.3 and 7). 


2 Background 


Usually, system administrators classify networks into security domains. Through 
firewalls they monitor the traffic and enforce a predetermined set of access control 
policies (packet filtering), possibly performing some network address translation. 

Firewalls are implemented either as proprietary, special devices, or as soft- 
ware tools running on general purpose operating systems. Independently of their 
actual implementations, they are usually characterized by a set of rules that 
determine which packets reach the different subnetworks and hosts, and how 
they are modified or translated. We briefly review iptables [2] and pf [3] that 
are two of the most used firewall tools in Linux and Unix. 

iptables. It is the default in Linux distributions, and operates on top of 
Netfilter, the standard framework for packet processing of the Linux kernel [2]. This 
tool is based on the notions of tables and chains. Intuitively, a table is a collec- 
tion of ordered lists of policy rules called chains. The most commonly used tables 
are: filter for packet filtering; nat for network address translation; mangle for 
packet alteration. There are five built-in chains that are inspected at specific 
moments of the packet life cycle [17]: PreRouting, when the packet reaches the 
host; Forward, when the packet is routed through the host; PostRouting, right 
before the packet leaves the host; Input, when the packet is routed to the host; 
Output, when the packet is generated by the host. Moreover, users can define 
additional chains, besides the built-in ones. 

Each rule specifies a condition and a target. If the packet matches the con- 
dition then it is processed according to the specified target. The most common 
targets are: ACCEPT and DROP, to accept and discard packets; DNAT/SNAT, 
to perform destination/source NAT; MARK to mark a packet with a numeric 
identifier which can be used in the conditions of other rules, even placed in dif- 
ferent chains; RETURN, to stop examining the current chain and resume the 
processing of a previous chain. When the target is a user-defined chain, two 
“jumping” modes are available: call and goto. They differ when a RETURN is 
executed or the end of the chain is reached: the evaluation resumes from the rule 
following the last matched call. Built-in chains have a user-configurable default 
policy (ACCEPT or DROP): if the evaluation reaches the end of a built-in chain 
without matches, its default policy is applied. 


pf. This is the standard firewall of OpenBSD [3] and is included in macOS since 
version 10.7. Similarly to iptables, each rule consists of a predicate which is 
used to select packets and an action that specifies how to process the packets 
satisfying the predicate. The most frequently used actions are pass and block to 
accept and reject packets, rdr and nat to perform destination and source NAT. 
Packet marking is supported also by pf: if a rule containing the tag keyword 
is applied, the packet is marked with the specified identifier and then processed 
according to the rule’s action. 

Differently from other firewalls, the action taken on a packet is determined 
by the last matched rule, unless otherwise specified. pf has a single ruleset that 
is inspected both when the packet enters and exits the host. When a packet 
enters the host, DNAT rules are examined first and filtering is performed after 
the address translation. Similarly when a packet leaves the host: first its source 
address is translated by the relevant SNAT rules, and then the resulting packet 
is possibly filtered. Notice also that packets belonging to established connections 
are accepted by default, thus bypassing the filters. 


3 Porting a Policy: An Example 


Consider the simple, yet realistic network of Fig.1, where the IP addresses 
10.0.0.0/8 identify the private LAN; 54.230.203.0/24 identify servers and produc- 
tion machines in the demilitarized zone DMZ that also hosts the HTTPS server 


Fig. 1. A network. 


Table 1. Declarative representation of the configuration in Fig. 2. 


Src IP     | Src Port | SNAT IP   | SNAT Port | DNAT IP | DNAT Port | Dest IP                                        | Dest Port | Prot | State
*          | *        | -         | -         | -       | -         | {54.230.203.47}                                | 443       | tcp  | NEW
10.0.0.0/8 | *        | -         | -         | -       | -         | 54.230.203.0/24                                | *         | *    | NEW
10.0.0.0/8 | *        | 23.1.8.15 | -         | -       | -         | * \ {10.0.0.0/8, 54.230.203.0/24, 127.0.0.0/8} | {80, 443} | tcp  | NEW
*          | *        | *         | *         | *       | *         | *                                              | *         | *    | ESTABLISHED


with address 54.230.203.47. The firewall has three interfaces: eth0 connected to 
the LAN with IP 10.0.0.1, eth1 connected to the DMZ with IP 54.230.203.1 and 
ext connected to the Internet with public IP 23.1.8.15. 

The iptables configuration in Fig.2 enforces the following policy on the 
traffic: (i) hosts from the Internet can connect to the HTTPS server; (ii) LAN 
hosts can freely connect to any host in the DMZ; (iii) LAN hosts can connect 
to the Internet over HTTP and HTTPS (with source NAT). Now, suppose the 
system administrator has to migrate the firewall configuration of Fig.2 from 
iptables to pf. Performing this porting by hand is complex and error prone 
because the administrator has to write the pf configuration from scratch and 
test that it is equivalent to the original one. Furthermore, this requires a deep 
understanding of the policy meaning, as well as of both iptables and pf and 
of their configuration languages. We apply below the stages of our pipeline to 
solve this problem, guaranteeing by construction that the firewall semantics is 
preserved. The next sections detail the following intuitive description. 

First we extract the meaning of the iptables configuration represented by a 
table, in our case Table 1 (stages 1 and 2). For instance, its second row says that 
the packets of a new connection with source address in the range 10.0.0.0/8 (i.e., 
from the LAN) can reach the hosts in the range 54.230.203.0/24 (the DMZ), with 
no NAT, regardless of the protocol and the port. The last row says that packets of 
an already established connection are always allowed. Note that each row in the 
table declaratively describes a set of packets accepted by the firewall, and their 
network translation. Actually, Table 1 is a clean, refactored policy automatically 
generated by the tool of [1]. Indeed, each row is disjoint from the others, so they 
need not be ordered and none of the typical firewall anomalies arises, like 
*nat
# ACCEPT policy in nat chains
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]

# (iii) Apply SNAT on connections from the LAN towards the Internet
-A POSTROUTING -s 10.0.0.0/8 -o ext -j MASQUERADE

COMMIT

*filter
# DROP policy in filtering chains
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT DROP [0:0]

# Allow established packets
-A FORWARD -m state --state ESTABLISHED -j ACCEPT
# (ii) LAN hosts can connect to DMZ
-A FORWARD -s 10.0.0.0/8 -d 54.230.203.0/24 -j ACCEPT
# (iii) LAN hosts can connect to the Internet over HTTP/HTTPS
-A FORWARD -s 10.0.0.0/8 -o ext -p tcp --dport 80 -j ACCEPT
-A FORWARD -s 10.0.0.0/8 -o ext -p tcp --dport 443 -j ACCEPT
# (i) Any host can connect to the HTTPS server in the DMZ
-A FORWARD -d 54.230.203.47 -p tcp --dport 443 -j ACCEPT

COMMIT


Fig. 2. Firewall configuration in iptables. 


nat proto tcp from 10.0.0.0/8 to {!10.0.0.0/8, !54.230.203.0/24, !127.0.0.0/8} 
port {80, 443} tag T1 -> 23.1.8.15 


block all 

pass proto tcp from any to 54.230.203.47 port 443 
pass from 10.0.0.0/8 to 54.230.203.0/24 

pass tagged T1 




Fig. 3. The policy in Fig. 2 ported in pf. 


shadowing, rule overlapping, etc. According to stage 3, we compile the refactored 
policy into pf, in two steps. First, the rows are translated into a sequence of IFCL 
rules that are then compiled into pf. The result is in Fig. 3 and was computed 
with a proof-of-concept extension of [1] based on the theory presented in Sect. 7. 


4 The Intermediate Firewall Configuration Language 


We now present our intermediate firewall configuration language (IFCL). It is 
parametric w.r.t. the notion of state and the steps performed to elaborate pack- 
ets. For generality, we do not detail the format of network packets. In the follow- 
ing we only use sa(p) and da(p) to denote the source and destination addresses 
of a given packet p; additionally, tag(p) returns the tag m associated with p. An 
address a consists of an IP address ip(a) and possibly a port port(a). An address 
range n is a pair consisting of a set of IP addresses and a set of ports, denoted 
IP(n):port(n). An address a is in the range n (written $a \in n$) if $ip(a) \in IP(n)$ 
and $port(a) \in port(n)$, where the port condition is checked only when port(a) is 
defined; e.g., for ICMP packets we only check if the IP address is in the range. 
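
The membership test just described can be sketched in a few lines of Python. This is only an illustration: the class names `Address` and `AddressRange`, the use of CIDR strings for the IP set, and the sample range `lan_web` are assumptions made here, not part of IFCL.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Address:
    ip: str
    port: Optional[int] = None  # None models packets without ports (e.g., ICMP)


@dataclass(frozen=True)
class AddressRange:
    networks: Tuple[str, ...]  # IP(n), given here as CIDR blocks (assumption)
    ports: FrozenSet[int]      # port(n)


def in_range(a: Address, n: AddressRange) -> bool:
    """a in n: the IP must belong to IP(n); the port is checked only if defined."""
    ip_ok = any(ip_address(a.ip) in ip_network(net) for net in n.networks)
    if a.port is None:
        return ip_ok
    return ip_ok and a.port in n.ports


# Hypothetical range: LAN addresses on the HTTP/HTTPS ports.
lan_web = AddressRange(("10.0.0.0/8",), frozenset({80, 443}))
```

An address without a port, such as `Address("10.1.2.3")`, is accepted as soon as its IP lies in the range, mirroring the ICMP case mentioned above.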

Firewalls modify packets, e.g., through network address translations. We 
write $p[da \mapsto a]$ and $p[sa \mapsto a]$ to denote a packet identical to p, except for 
the destination address da and source address sa, which is equal to a, 
respectively. Similarly, $p[tag \mapsto m]$ denotes the packet with a modified tag m. 

Here we consider stateful firewalls that keep track of the state s of network 
connections and use this information to process a packet. Any existing network 
connection can be described by several protocol-specific properties, e.g., source 
and destination addresses or ports, and by the translations to apply. In this way, 
filtering and translation decisions are not only based on administrator-defined 
rules, but also on the information built by previous packets belonging to the 
same connection. We omit a precise definition of a state, but we assume that 
it tracks at least the source and destination ranges, NAT operations and the 
state of the connection, i.e., established or not. When receiving a packet p one 
may check whether it matches the state s or not. We leave unspecified the match 
between a packet and the state because it depends on the actual shape of the 
state. When the match succeeds, we write $p \vdash_s \alpha$, where $\alpha$ describes the actions 
to be carried out on p; otherwise we write $p \nvdash_s$. 

A firewall rule is made of two parts: a predicate $\phi$ expressing criteria over 
packets, and an action t, called target, defining the “destiny” of matching packets. 
Here we consider a core set of actions included in most of the real firewalls. These 
actions not only determine whether or not a packet passes across the firewall, 
but also control the flow in which the rules are applied. They are the following: 


ACCEPT             a packet passes
DROP               a packet is discarded
CALL(R)            invoke the ruleset R (see below)
GOTO(R)            jump to the ruleset R
RETURN             exit from the current ruleset
NAT($n_d$, $n_s$)  network translation
MARK(m)            marking with tag m
CHECK-STATE(X)     examine the state


The targets CALL(_) and RETURN implement a procedure-like behavior; GOTO(_) is 
similar to unconditional jumps. In the NAT action $n_d$ and $n_s$ are address ranges used 
to translate the destination and source address of a packet, respectively; in the 
following we use the symbol $\star$ to denote an identity translation, e.g., $n : \star$ means 
that the address is translated according to n, whereas the port is kept unchanged. 
The MARK action marks a packet with a tag m. The argument $X \in \{\leftarrow, \rightarrow, \leftrightarrow\}$ of 
the CHECK-STATE action denotes the fields of the packets that are rewritten 
according to the information from the state. More precisely, $\rightarrow$ rewrites the destination 
address, $\leftarrow$ the source one and $\leftrightarrow$ both. Formally: 


Definition 1 (Firewall rule). A firewall rule r is a pair $(\phi, t)$ where $\phi$ is a 
logical formula over a packet, and t is the target action of the rule. 

A packet p matches a rule r with target t whenever $\phi$ holds. 


Definition 2 (Rule match). Given a rule $r = (\phi, t)$ we say that p matches r 
with target t, denoted $p \models_r t$, iff $\phi(p)$. We write $p \not\models_r$ when p does not match r. 


We can now define how a packet is processed given a possibly empty list of 
rules (the empty list is denoted by $\epsilon$), hereafter called ruleset. Similarly to real 
implementations of firewalls, we inspect the rules in the list, one after the other, until 
we find a matching one, which establishes the destiny (or target) of the packet. For 
sanity, we assume that no GOTO(R) and CALL(R) occur in the ruleset R, so avoiding 
self-loops. We also assume that rulesets may have a default target denoted by 
$t_d \in \{ACCEPT, DROP\}$, which accepts or drops according to the will of the system 
administrator. 


Definition 3 (Ruleset match). Given a ruleset $R = [r_1, \ldots, r_n]$, we say that 
p matches the i-th rule with target t, denoted $p \models_R (t, i)$, iff 

$$i \le n \;\wedge\; r_i = (\phi, t) \;\wedge\; p \models_{r_i} t \;\wedge\; \forall j < i.\ p \not\models_{r_j}$$

We also write $p \not\models_R$ if p matches no rules in R, formally if $\forall r \in R.\ p \not\models_r$. 
Afterwards, we will omit the index i when immaterial, and we simply write $p \models_R t$. 
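
As a minimal sketch of Definition 3, the first-match search can be written as follows. The representation is an assumption made for illustration: a rule is a pair of a Python predicate and a target name, packets are plain dictionaries, and the sample `rules` list is hypothetical.

```python
def ruleset_match(rules, packet):
    """Return (t, i) for the first rule whose predicate holds on the packet.

    The index i is 1-based, as in Definition 3; None means that the packet
    matches no rule of the ruleset.
    """
    for i, (phi, target) in enumerate(rules, start=1):
        if phi(packet):
            return target, i
    return None


# Hypothetical ruleset over packets modelled as dictionaries.
rules = [
    (lambda p: p["dport"] == 22, "DROP"),
    (lambda p: p["src"].startswith("10."), "ACCEPT"),
    (lambda p: True, "DROP"),
]
```

Because the scan stops at the first hit, all earlier rules are guaranteed not to match, which is exactly the last conjunct of the definition.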


In our model we do not explicitly specify the steps performed by the kernel of 
the operating system to process a single packet passing through the host. We 
represent this algorithm through a control diagram, i.e., a graph where nodes 
represent different processing steps and the arcs determine the sequence of steps. 
The arcs are labeled with a predicate describing the requirements a packet has 
to meet in order to pass to the next processing phase. Therefore, they are not 
finite state automata. We assume that control diagrams are deterministic, i.e., 
that every pair of arcs leaving the same node has mutually exclusive predicates. 
For generality, we leave these predicates abstract, since they depend on the specific 
firewall. 


Definition 4 (Control diagram). Let $\Psi$ be a set of predicates over packets. 
A control diagram C is a tuple $(Q, A, q_i, q_f)$, where 

- Q is the set of nodes; 
- $A \subseteq Q \times \Psi \times Q$ is the set of arcs, such that whenever $(q, \psi, q'), (q, \psi', q'') \in A$ 
  and $q' \neq q''$ then $\neg(\psi \wedge \psi')$; 
- $q_i, q_f \in Q$ are special nodes denoting the start and the end of elaboration. 


The firewall filters and possibly translates a given packet by traversing a control 
diagram according to the following transition function. 


Definition 5 (Transition function). Let $(Q, A, q_i, q_f)$ be a control diagram 
and let p be a packet. The transition function $\delta : Q \times Packet \to Q$ is defined as 

$$\delta(q, p) = q' \quad \text{iff} \quad \exists (q, \psi, q') \in A.\ \psi(p) \text{ holds.}$$
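
The transition function can be sketched as a lookup over the arcs. The encoding below is an assumption for illustration only: arcs are triples of a source node, a Python predicate and a destination node, and the diagram `arcs` with the set `LOCAL` is a toy example, not the iptables diagram of Fig. 4.

```python
def delta(arcs, q, packet):
    """delta(q, p): the node reached from q along an arc whose predicate holds on p.

    Returns None when no outgoing arc is enabled; raises if two arcs towards
    different nodes are enabled at once, i.e., the diagram is not deterministic.
    """
    successors = {dst for (src, psi, dst) in arcs if src == q and psi(packet)}
    if len(successors) > 1:
        raise ValueError(f"diagram is not deterministic at node {q!r}")
    return successors.pop() if successors else None


LOCAL = {"10.0.0.1"}  # hypothetical set of local addresses of the host

# Toy control diagram: route to Inp when the destination is local, else to Fwd.
arcs = [
    ("qi", lambda p: True, "Pre"),
    ("Pre", lambda p: p["da"] in LOCAL, "Inp"),
    ("Pre", lambda p: p["da"] not in LOCAL, "Fwd"),
    ("Inp", lambda p: True, "qf"),
    ("Fwd", lambda p: True, "qf"),
]
```

The run-time check on `successors` is a dynamic stand-in for the static mutual-exclusion requirement of Definition 4.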


We can now define a firewall in IFCL. 


Definition 6 (Firewall). A firewall F is a triple $(C, \rho, c)$, where C is a control 
diagram; $\rho$ is a set of rulesets; and $c : Q \to \rho$ is the correspondence mapping 
from the nodes of C to the actual rulesets. 

Fig. 4. The control diagram of iptables 


4.1 Decompiling Two Real Languages into IFCL 


Here we encode the two de facto standard Unix firewalls iptables and pf as 
triples $(C, \rho, c)$ of our framework (stage 1). An immediate fallout is a formal 
semantics for both iptables and pf defined in terms of that of IFCL (see Sect. 5). 


Modelling iptables. Let $\mathcal{L}$ be the set of local addresses of a host; and let $\psi_1$ 
and $\psi_2$ be predicates over packets defined as follows: 

$$\psi_1(p) = sa(p) \in \mathcal{L} \qquad \psi_2(p) = da(p) \in \mathcal{L}.$$

Figure 4 shows the control diagram C of iptables, where unlabeled arcs carry 
the label “true.” It also implicitly defines the transition function according to 
Definition 5. In iptables there are twelve built-in chains, each of which 
corresponds to a single ruleset. So we can define the set $\rho_p \subseteq \rho$ of primitive rulesets 
as the one made of $R^{man}_{Inp}$, $R^{nat}_{Inp}$, $R^{fil}_{Inp}$, $R^{man}_{Out}$, $R^{nat}_{Out}$, $R^{fil}_{Out}$, $R^{man}_{Pre}$, $R^{nat}_{Pre}$, $R^{man}_{For}$, 
$R^{fil}_{For}$, $R^{man}_{Post}$ and $R^{nat}_{Post}$, where the superscript represents the table name and 
the subscript the chain name. Note that the set $\rho \setminus \rho_p$ contains the user-defined 
chains. 

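The mapping function $c : Q \to \rho$ is defined as follows: 

$$c(q_i) = c(q_f) = R_\epsilon \qquad c(Pre^m) = R^{man}_{Pre} \qquad c(Pre^n) = R^{nat}_{Pre}$$
$$c(Inp^m) = R^{man}_{Inp} \qquad c(Inp^n) = R^{nat}_{Inp} \qquad c(Inp^f) = R^{fil}_{Inp}$$
$$c(Out^m) = R^{man}_{Out} \qquad c(Out^n) = R^{nat}_{Out} \qquad c(Out^f) = R^{fil}_{Out}$$
$$c(Fwd^m) = R^{man}_{For} \qquad c(Fwd^f) = R^{fil}_{For} \qquad c(Post^m) = R^{man}_{Post} \qquad c(Post^n) = R^{nat}_{Post}$$

where $R_\epsilon$ is an empty ruleset with ACCEPT as default policy. 
Finally, note that the action CALL(_) implements the built-in target JUMP(_). 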


Modelling pf. Differently from iptables, pf has a single ruleset and the rule 
applied to a packet is the last one matched, apart from the case of the so-called 
quick rules: as soon as one of these rules matches the packet, its action is applied 
and the remaining part of the ruleset is skipped. 

Fig. 5. The control diagram of pf 

Figure 5 shows the control diagram $C_{pf}$ for pf that also defines the transition 
function. The nodes $Inp^n$ and $Inp^f$ represent the procedure executed when an 
IP packet reaches the host from the net. Dually, $Out^n$ and $Out^f$ are for when the 
packet leaves the host. The predicates $\psi_1$ and $\psi_2$ are those defined for iptables. 
Given the pf ruleset $R_{pf}$ we include the following rulesets in $\rho_{pf}$: 


- $R_{dnat}$ contains the rule (state == 1, CHECK-STATE($\rightarrow$)) as the first one, followed 
  by all the rdr rules of $R_{pf}$; 
- $R_{snat}$ contains the rule (state == 1, CHECK-STATE($\leftarrow$)) as the first one, followed 
  by all the nat rules of $R_{pf}$; 
- $R_{finp}$ contains the rule (state == 1, ACCEPT) followed by all the quick filtering 
  rules of $R_{pf}$ without modifier out, and finally the rule (true, GOTO($R_{finpr}$)); 
- $R_{finpr}$ contains all the non-quick filtering rules of $R_{pf}$ without modifier out, 
  in reverse order; 
- $R_{fout}$ contains the rule (state == 1, ACCEPT) followed by all the quick filtering 
  rules of $R_{pf}$ without modifier in, and (true, GOTO($R_{foutr}$)) as last rule; 
- $R_{foutr}$ includes all the non-quick filtering rules of $R_{pf}$ without modifier in, in 
  reverse order. 
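
The reordering used in the list above (quick rules first, then the remaining rules reversed) turns the last-match discipline of pf into a first-match one. The following sketch checks this on a small example; it deliberately ignores the in/out modifiers and the initial state rule, and the triple encoding of rules and the sample `pf_rules` are assumptions made for illustration.

```python
from itertools import product


def pf_eval(rules, packet):
    """pf-style scan: the last matching rule wins, unless a quick rule matches."""
    decision = None
    for pred, action, quick in rules:
        if pred(packet):
            decision = action
            if quick:
                break  # a matching quick rule stops the scan immediately
    return decision


def to_first_match(rules):
    """Quick rules in their original order, then the non-quick ones reversed."""
    quick = [(pred, action) for pred, action, q in rules if q]
    non_quick = [(pred, action) for pred, action, q in rules if not q]
    return quick + non_quick[::-1]


def first_match(rules, packet):
    """First-match evaluation, as in the rulesets of the intermediate language."""
    for pred, action in rules:
        if pred(packet):
            return action
    return None


# Hypothetical pf ruleset: (predicate, action, is_quick).
pf_rules = [
    (lambda p: True, "block", False),
    (lambda p: p["dport"] == 443, "pass", False),
    (lambda p: p["src"] == "lan", "pass", False),
    (lambda p: p["dport"] == 23, "block", True),
]
packets = [{"src": s, "dport": d} for s, d in product(["lan", "inet"], [23, 80, 443])]
agree = all(
    pf_eval(pf_rules, p) == first_match(to_first_match(pf_rules), p) for p in packets
)
```

The two evaluations agree because a matching quick rule overrides every non-quick rule regardless of position, while among non-quick rules the last match in the original order is the first match in the reversed one.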


Given the empty ruleset $R_\epsilon$ with ACCEPT as default policy, the mapping 
function $c_{pf}$ is defined as follows: 

$$c_{pf}(q_i) = R_\epsilon \qquad c_{pf}(Inp^n) = R_{dnat} \qquad c_{pf}(Out^n) = R_{snat}$$
$$c_{pf}(q_f) = R_\epsilon \qquad c_{pf}(Inp^f) = R_{finp} \qquad c_{pf}(Out^f) = R_{fout}$$


5 Formal Semantics 


Now, we formally define the semantics of a firewall through two transition 
systems operating in a master-slave fashion. The master has a labeled transition 
relation of the form $s \xrightarrow{p, p'} s'$. The intuition is that the state s of a firewall 
changes to s' when a new packet p reaches the host and becomes p'. 

The configurations of the slave transition system are triples (q, s, p) where: 
(i) $q \in Q$ is a control diagram node; (ii) s is the state of the firewall; (iii) p is 
the packet. A transition $(q, s, p) \to (q', s, p')$ describes how a firewall in a state s 
deals with a packet p and possibly transforms it into p', according to the control 
diagram C. Recall that the state records established connections and other kinds 
of information that are updated after the transition. 

In the slave transition relation, we use the following predicate, which 
describes an algorithm that runs a ruleset R on a packet p in the state s: 

$$p, s \models^S_R (t, p')$$

This predicate searches for a rule in R matching the packet p through $p \models_R (t, i)$. 
If it finds a match with target t, t is applied to p to obtain a new packet p'. 

Recall that the actions CALL(R), RETURN and GOTO(R) are similar to procedure calls, 
returns and jumps in imperative programming languages. To correctly deal with 
them, our predicate $p, s \models^S_R (t, p')$ uses a stack S to implement a behavior similar 
to the one of procedure calls. We will denote with $\epsilon$ the empty stack and with $\cdot$ 
the concatenation of elements on the stack. This stack is also used to detect and 
prevent loops in ruleset invocation, as is the case in real firewalls. 

In the stack S we overline a ruleset R to indicate that it was pushed by 
a GOTO(_) action and it has to be skipped when returning. Indeed, we use the 
following pop* function in the semantics of the RETURN action: 


$$pop^*(\epsilon) = \epsilon \qquad pop^*(R \cdot S) = (R, S) \qquad pop^*(\overline{R} \cdot S) = pop^*(S)$$


In case there is a non-overlined ruleset on the top of S, it behaves as a standard 
pop operation; otherwise it extracts the first non-overlined ruleset. When S is 
empty, we assume that pop* returns $\epsilon$ to signal the error. 
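
A minimal sketch of pop* follows. The stack encoding is an assumption made here: a tuple of (ruleset, overlined) pairs with the top element first, and `None` standing for the error value returned on an exhausted stack.

```python
def pop_star(stack):
    """pop*: return (R, rest) for the first non-overlined entry, skipping
    overlined ones (those pushed by GOTO); None if no such entry exists."""
    for k, (ruleset, overlined) in enumerate(stack):
        if not overlined:
            return ruleset, stack[k + 1:]
    return None
```

For instance, `pop_star((("A", True), ("B", False), ("C", False)))` skips the overlined entry `A` and yields `B` together with the remaining stack holding `C`.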

Furthermore, in the definition of $p, s \models^S_R (t, p')$ we use the notation $R_k$ to 
indicate the ruleset $[r_k, \ldots, r_n]$ ($k \in [1, n]$) resulting from dropping the first $k - 1$ 
rules from the given ruleset $R = [r_1, \ldots, r_n]$. 

We also assume the function establ that, given an action $\alpha$ from the state, 
a packet p and the fields $X \in \{\leftarrow, \rightarrow, \leftrightarrow\}$ to rewrite, returns a possibly changed 
packet p', e.g., in case of an established connection. This function also depends 
on the specific firewall we are modeling, and so it is left unspecified. 

Finally, we assume as given a function $nat(p, s, d_n, s_n)$ that returns the packet 
p translated under the corresponding NAT operation in the state s. The argument 
$d_n$ is used to modify the destination range of p, i.e., destination NAT (DNAT), 
while $s_n$ is used to modify the source range, i.e., source NAT (SNAT). Recall 
that a range of the form $\star : \star$ is interpreted as the identity translation, whereas 
one of the form $a : \star$ modifies only the address. This function is also left abstract. 

Table 2 shows the rules defining $p, s \models^S_R (t, p')$. The first inference rule deals 
with the case when the packet p matches a rule that says ACCEPT or DROP; in this case 
the ruleset execution stops returning the found action and leaving p unmodified. 
When a packet p matches a rule with action CHECK-STATE, we query the state s: 
if p belongs to an established connection, we return ACCEPT and a p' obtained 
by rewriting p. If p belongs to no existing connection the packet is matched against 
the remaining rules in the ruleset. When a packet p matches a NAT rule, we return 
ACCEPT and the packet resulting from the invocation of the function nat. There are 
two cases if a packet p matches a GOTO(_). If the ruleset R' is not already in the 
stack, we push the current ruleset R onto the stack overlined to record that this 
ruleset dictated a GOTO(_). Otherwise, if R' is in the stack, we detect a loop and 
discard p. The case when a packet p matches a rule with action CALL(_) is similar, 
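
Table 2. The predicate $p, s \models^S_R (t, p')$. 

(1) $$\frac{p \models_R (t, i) \quad t \in \{ACCEPT, DROP\}}{p, s \models^S_R (t, p)}$$

(2) $$\frac{p \models_R (CHECK\text{-}STATE(X), i) \quad p \vdash_s \alpha \quad p' = establ(\alpha, X, p)}{p, s \models^S_R (ACCEPT, p')}$$

(3) $$\frac{p \models_R (CHECK\text{-}STATE(X), i) \quad p \nvdash_s \quad p, s \models^S_{R_{i+1}} (t, p')}{p, s \models^S_R (t, p')}$$

(4) $$\frac{p \models_R (NAT(d_n, s_n), i)}{p, s \models^S_R (ACCEPT, nat(p, s, d_n, s_n))}$$

(5) $$\frac{p \models_R (GOTO(R'), i) \quad R' \notin S \quad p, s \models^{\overline{R_{i+1}} \cdot S}_{R'} (t, p')}{p, s \models^S_R (t, p')}$$

(6) $$\frac{p \models_R (GOTO(R'), i) \quad R' \in S}{p, s \models^S_R (DROP, p)}$$

(7) $$\frac{p \models_R (CALL(R'), i) \quad R' \notin S \quad p, s \models^{R_{i+1} \cdot S}_{R'} (t, p')}{p, s \models^S_R (t, p')}$$

(8) $$\frac{p \models_R (CALL(R'), i) \quad R' \in S}{p, s \models^S_R (DROP, p)}$$

(9) $$\frac{p \models_R (RETURN, i) \quad pop^*(S) = (R', S') \quad p, s \models^{S'}_{R'} (t, p')}{p, s \models^S_R (t, p')}$$

(10) $$\frac{p \models_R (RETURN, i) \quad pop^*(S) = \epsilon}{p, s \models^S_R (t_d, p)}$$

(11) $$\frac{p \not\models_R \quad S \neq \epsilon \quad pop^*(S) = (R', S') \quad p, s \models^{S'}_{R'} (t, p')}{p, s \models^S_R (t, p')}$$

(12) $$\frac{p \not\models_R \quad (S = \epsilon \;\vee\; pop^*(S) = \epsilon)}{p, s \models^S_R (t_d, p)}$$

(13) $$\frac{p \models_R (MARK(m), i) \quad p[tag \mapsto m], s \models^S_{R_{i+1}} (t, p')}{p, s \models^S_R (t, p')}$$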


except that the ruleset pushed on the stack is not overlined. When a packet p 
matches a rule with action RETURN, we pop the stack and match p against the 
top of the stack. Finally, when no rule matches, an implicit return occurs: we 
continue from the top of the stack, if non-empty. The MARK rule simply changes 
the tag of the matching packet to the value m. If none of the above applies, we 
return the default action $t_d$ of the current ruleset. 
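
The behavior described by these rules can be sketched as a small recursive interpreter. Everything in the encoding is an assumption made for illustration: rulesets are named lists of (predicate, target) pairs, targets are tuples such as `("CALL", "u1")`, a stack frame `(name, idx, overlined)` records where evaluation resumes in ruleset `name`, packets are dictionaries, and the abstract functions for state lookup, establ and nat are passed in through a hypothetical `hooks` dictionary.

```python
def pop_star(stack):
    """pop*: skip overlined (GOTO) frames; None plays the role of the error value."""
    for k, (name, idx, overlined) in enumerate(stack):
        if not overlined:
            return (name, idx), stack[k + 1:]
    return None


def run(rulesets, defaults, name, packet, state=None, hooks=None, start=0, stack=()):
    """Evaluate ruleset `name` from rule index `start`; return (target, packet')."""
    rules = rulesets[name]
    for i in range(start, len(rules)):
        phi, target = rules[i]
        if not phi(packet):
            continue
        kind = target[0]
        if kind in ("ACCEPT", "DROP"):        # packet left unmodified
            return kind, packet
        if kind == "NAT":                     # delegate to the abstract nat
            return "ACCEPT", hooks["nat"](packet, state, target[1], target[2])
        if kind == "CHECK-STATE":
            alpha = hooks["lookup"](packet, state)
            if alpha is not None:             # established connection
                return "ACCEPT", hooks["establ"](alpha, target[1], packet)
            continue                          # otherwise try the remaining rules
        if kind == "MARK":                    # retag, then go on with the rest
            packet = {**packet, "tag": target[1]}
            continue
        if kind in ("GOTO", "CALL"):
            callee = target[1]
            if any(frame[0] == callee for frame in stack):
                return "DROP", packet         # loop detected
            frame = (name, i + 1, kind == "GOTO")  # GOTO frames are overlined
            return run(rulesets, defaults, callee, packet, state, hooks, 0,
                       (frame,) + stack)
        if kind == "RETURN":
            break                             # handled like the implicit return
        raise ValueError(f"unknown target {kind!r}")
    popped = pop_star(stack)
    if popped is None:                        # nothing to return to: default t_d
        return defaults[name], packet
    (caller, resume_at), rest = popped
    return run(rulesets, defaults, caller, packet, state, hooks, resume_at, rest)
```

Loop detection compares ruleset names against the frames on the stack, which is one possible reading of the membership test in the rules; the default target of every ruleset is supplied explicitly in `defaults`.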

We can now define the slave transition relation as follows. 


$$\frac{c(q) = R \quad p, s \models^{\epsilon}_R (ACCEPT, p') \quad \delta(q, p') = q'}{(q, s, p) \to (q', s, p')}$$

The rule describes how we process the packet p when the firewall is in the 
elaboration step represented by the node q with a state s. We match p against 
the ruleset R associated with q and if p is accepted as p', we continue considering 
the next step of the firewall execution represented by the node q'. 
Finally, we define the master transition relation that transforms states and 
packets as follows (as usual, below $\to^+$ stands for the transitive closure of $\to$): 

$$\frac{(q_i, s, p) \to^+ (q_f, s, p')}{s \xrightarrow{p, p'} s \uplus (p, p')}$$

This rule says that when the firewall is in the state s and receives a packet p, 
it elaborates p starting from the initial node $q_i$ of its control diagram. If this 
elaboration succeeds, i.e., it reaches the node $q_f$ and accepts p as p', we update the 
state s by storing information about p, its translation p' and the connection they 
belong to, through the function $\uplus$, left unspecified for the sake of generality. 

Example 1. Suppose we have the user-defined chains below 

Chain $C_B$                Chain u1                     Chain u2
($\phi_1$, DROP)           ($\phi_{11}$, ACCEPT)        ($\phi_{21}$, ACCEPT)
($\phi_2$, CALL(u1))       ($\phi_{12}$, CALL(u2))      ($\phi_{22}$, RETURN)
($\phi_3$, ACCEPT)         ($\phi_{13}$, DROP)          ($\phi_{23}$, DROP)

and that the condition $\neg\phi_1 \wedge \phi_2 \wedge \phi_{11}$ holds for a packet p. Then, the semantic 
rules (a), (b) and (c) are applied in order: 

(a) $$\frac{p \models_{C_B} (CALL(u1), i) \quad u1 \notin S \quad p, s \models^{{C_B}_{i+1} \cdot S}_{u1} (ACCEPT, p)}{p, s \models^S_{C_B} (ACCEPT, p)}$$

(b) $$\frac{p \models_{u1} (ACCEPT, 1)}{p, s \models^{{C_B}_{i+1} \cdot S}_{u1} (ACCEPT, p)}$$

(c) $$\frac{c(q) = C_B \quad p, s \models^{\epsilon}_{C_B} (ACCEPT, p) \quad \delta(q, p) = q'}{(q, s, p) \to (q', s, p)}$$


6 From Operational to Declarative Descriptions 


We now extract the meaning of a firewall written in our intermediate language by 
transforming it in a declarative, logical presentation that preserves the semantics 
(stage 2). This transformation is done in three steps: (i) generate an unfolded 
firewall with a single ruleset for each node of the control diagram; (ii) transform the 
unfolded firewall into a first-order formula; (iii) determine a model for the obtained 
formula, through a SAT solver (the procedure for this step is described in [1] 
and is omitted here). The correctness of stage 2 follows from Theorem 1, which 
guarantees that the unfolded firewall is semantically equivalent to the original 
one, and from Theorem 2, which ensures that the derived formula characterizes 
exactly the accepted packets and their translations. 


6.1 Unfolding Chains 


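Our intermediate language can deal with involved control flows, by using the 
targets GOTO(_), CALL(_) and RETURN (see Example 1). The following unfolding operation 
$\llbracket \cdot \rrbracket$ rewrites a ruleset into an equivalent one with no control flow rules. 
Hereafter, let r; R be a non-empty ruleset consisting of a rule r followed by 
a possibly empty ruleset R; and let $R_1 @ R_2$ be the concatenation of $R_1$ and $R_2$. 
The unfolding of a ruleset R is defined as follows: 

$$\llbracket R \rrbracket = \llbracket R \rrbracket^{true}_{\{R\}}$$

$$\llbracket \epsilon \rrbracket^f_I = \epsilon$$

$$\llbracket (\phi, t); R \rrbracket^f_I = (f \wedge \phi, t); \llbracket R \rrbracket^f_I \quad \text{if } t \notin \{GOTO(R'), CALL(R'), RETURN\}$$

$$\llbracket (\phi, RETURN); R \rrbracket^f_I = \llbracket R \rrbracket^{f \wedge \neg\phi}_I$$

$$\llbracket (\phi, CALL(R')); R \rrbracket^f_I = \begin{cases} \llbracket R' \rrbracket^{f \wedge \phi}_{I \cup \{R'\}} \,@\, \llbracket R \rrbracket^f_I & \text{if } R' \notin I \\ (f \wedge \phi, DROP); \llbracket R \rrbracket^f_I & \text{otherwise} \end{cases}$$

$$\llbracket (\phi, GOTO(R')); R \rrbracket^f_I = \begin{cases} \llbracket R' \rrbracket^{f \wedge \phi}_{I \cup \{R'\}} \,@\, \llbracket R \rrbracket^{f \wedge \neg\phi}_I & \text{if } R' \notin I \\ (f \wedge \phi, DROP); \llbracket R \rrbracket^{f \wedge \neg\phi}_I & \text{otherwise} \end{cases}$$
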
The auxiliary procedure $\llbracket R \rrbracket^{f}_{I}$ recursively inspects the ruleset $R$. The formula $f$
accumulates conjuncts of the predicate $\phi$; the set $I$ records the rulesets traversed
by the procedure and helps detect loops. If a rule does not affect control flow,
we just substitute the conjunction $f \wedge \phi$ for $\phi$, and continue to analyze the rest
of the ruleset with the recursive call $\llbracket R \rrbracket^{f}_{I}$.

In the case of a return rule $(\phi, \mathrm{RETURN})$ we generate no new rule, and we continue
to recursively analyze the rest of the ruleset, updating $f$ with the negation of
$\phi$. For the rule $(\phi, \mathrm{CALL}(R'))$ we have two cases: if the callee ruleset $R'$ is not in $I$,
we replace the rule with the unfolding of $R'$ with $f \wedge \phi$ as predicate, and add
$R'$ to the traversed rulesets. If $R'$ is already in $I$, i.e., we have a loop, we replace
the rule with a DROP, with $f \wedge \phi$ as predicate. In both cases, we continue unfolding
the rest of the ruleset. We deal with the rule $(\phi, \mathrm{GOTO}(R'))$ as with the previous one,
except that the rest of the ruleset has $f \wedge \neg\phi$ as predicate.
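
The unfolding procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: predicates are atomic strings, a conjunction is a tuple of literals, negation is written with a leading `!`, and the chain names `CB`, `u1`, `u2` with their rules are hypothetical, chosen only so that the result has the shape of an unfolded chain without control flow rules.

```python
def unfold(name, rulesets):
    """Unfold the ruleset called `name`; f starts as true (empty conjunction), I as {name}."""
    return _unfold(rulesets[name], (), frozenset({name}), rulesets)


def _unfold(rules, f, seen, rulesets):
    if not rules:                       # empty ruleset unfolds to the empty ruleset
        return []
    (phi, target), rest = rules[0], rules[1:]
    if target == "RETURN":              # no new rule; the rest sees f AND NOT phi
        return _unfold(rest, f + ("!" + phi,), seen, rulesets)
    if isinstance(target, tuple):       # ("CALL", R') or ("GOTO", R')
        kind, callee = target
        rest_f = f + ("!" + phi,) if kind == "GOTO" else f
        if callee in seen:              # loop detected: replace the rule with a DROP
            head = [(f + (phi,), "DROP")]
        else:                           # inline the callee under f AND phi
            head = _unfold(rulesets[callee], f + (phi,), seen | {callee}, rulesets)
        return head + _unfold(rest, rest_f, seen, rulesets)
    # rule that does not affect control flow
    return [(f + (phi,), target)] + _unfold(rest, f, seen, rulesets)


# Hypothetical chains (names and rules are assumptions made for this sketch).
CHAINS = {
    "CB": [("p1", "DROP"), ("p2", ("CALL", "u1")), ("p3", "ACCEPT")],
    "u1": [("p11", "ACCEPT"), ("p12", ("CALL", "u2")), ("p13", "DROP")],
    "u2": [("p21", "ACCEPT"), ("p22", "RETURN"), ("p23", "DROP")],
}

for rule in unfold("CB", CHAINS):
    print(rule)
```

Passing `seen` unchanged to the rest of the ruleset mirrors the definition, where only the unfolding of the callee receives the enlarged set of traversed rulesets.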


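Example 2. Back to Example 1, unfolding the chain $C_B$ gives the following rules:

\[
\begin{aligned}
\llbracket C_B \rrbracket = {} & (\phi_1, \mathrm{DROP}); \\
& (\phi_2 \wedge \phi_{11}, \mathrm{ACCEPT}); \\
& (\phi_2 \wedge \phi_{12} \wedge \phi_{21}, \mathrm{ACCEPT}); \\
& (\phi_2 \wedge \phi_{12} \wedge \neg\phi_{22} \wedge \phi_{23}, \mathrm{DROP}); \\
& (\phi_2 \wedge \phi_{13}, \mathrm{DROP}); \\
& (\phi_3, \mathrm{ACCEPT}); \\
& \epsilon
\end{aligned}
\]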


We just illustrate the first three steps: 


[CB] =[(¢1, po); Cal (os = (1, prop); [(@2, caLL cu) ); Cashes 
{CB} {CB} 


t A 
=p Nua OlCasl os} 

Note that our transformation does not change the set of accepted packets, e.g.,
all packets satisfying $\neg\phi_1 \wedge \phi_2 \wedge \phi_{11}$ are still accepted by the unfolded ruleset.


An unfolded firewall is obtained by repeatedly rewriting the rulesets associated 
with the nodes of its control diagram, using the procedure above. Formally, 


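Definition 7 (Unfolded firewall). Given a firewall $F = (C, \rho, c)$, its unfolded
version $\llbracket F \rrbracket$ is $(C, \rho', c')$ where $\forall q \in C.\ c'(q) = \llbracket c(q) \rrbracket$ and
$\rho' = \{\llbracket c(q) \rrbracket \mid q \in C\}$.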


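We now prove that a firewall $F$ and its unfolded version $\llbracket F \rrbracket$ are semantically
equivalent, i.e., they perform the same action over a given packet $p$ in a state $s$,
and reach the same state $s'$. Formally, the following theorem holds:

Table 3. Translation of rulesets into logical predicates.

\[
\begin{aligned}
P_{\epsilon}(p, \tilde{p}) &= dp(R) \wedge p = \tilde{p} \\
P_{r;R}(p, \tilde{p}) &= (\phi(p) \wedge p = \tilde{p}) \vee (\neg\phi(p) \wedge P_R(p, \tilde{p}))
  && \text{if } r = (\phi, \mathrm{ACCEPT}) \\
P_{r;R}(p, \tilde{p}) &= \neg\phi(p) \wedge P_R(p, \tilde{p})
  && \text{if } r = (\phi, \mathrm{DROP}) \\
P_{r;R}(p, \tilde{p}) &= (\phi(p) \wedge \tilde{p} \in tr(p, d_n, s_n, \leftrightarrow)) \vee (\neg\phi(p) \wedge P_R(p, \tilde{p}))
  && \text{if } r = (\phi, \mathrm{NAT}(d_n, s_n)) \\
P_{r;R}(p, \tilde{p}) &= (\phi(p) \wedge \tilde{p} \in tr(p, *{:}*, *{:}*, X)) \vee (\neg\phi(p) \wedge P_R(p, \tilde{p}))
  && \text{if } r = (\phi, \mathrm{CHECK\text{-}STATE}(X)) \\
P_{r;R}(p, \tilde{p}) &= (\phi(p) \wedge P_R(p[tag \mapsto m], \tilde{p})) \vee (\neg\phi(p) \wedge P_R(p, \tilde{p}))
  && \text{if } r = (\phi, \mathrm{MARK}(m))
\end{aligned}
\]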


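Theorem 1 (Correctness of unfolding). Let $F = (C, \rho, c)$ be a firewall and
$\llbracket F \rrbracket$ its unfolding. Let $s \xrightarrow{p,p'}_{X} s'$ be a step of the master transition system
performed by the firewall $X \in \{F, \llbracket F \rrbracket\}$. Then, it holds

\[
s \xrightarrow{p,p'}_{F} s' \iff s \xrightarrow{p,p'}_{\llbracket F \rrbracket} s'
\]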


6.2 Logical Characterization of Firewalls 


We construct a logical predicate that characterizes all the packets accepted by 
an unfolded ruleset, together with the relevant translations. 

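To deal with NAT, we define an auxiliary function $tr$ that computes the set of
packets resulting from all possible translations of a given packet $p$. The parameter
$X \in \{\leftarrow, \rightarrow, \leftrightarrow\}$ specifies if the translation applies to source, destination or both
addresses, respectively, similarly to CHECK-STATE($X$).

\[
\begin{aligned}
tr(p, d_n, s_n, \leftrightarrow) &\triangleq \{p[da \mapsto a_d, sa \mapsto a_s] \mid a_d \in d_n,\ a_s \in s_n\} \\
tr(p, d_n, s_n, \rightarrow) &\triangleq \{p[da \mapsto a_d] \mid a_d \in d_n\} \\
tr(p, d_n, s_n, \leftarrow) &\triangleq \{p[sa \mapsto a_s] \mid a_s \in s_n\}
\end{aligned}
\]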
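
As an illustration, the function tr can be sketched in Python as follows. The sketch assumes a packet reduced to its source and destination addresses, address sets given as plain Python sets of strings, and the kind of translation passed as one of the strings `"src"`, `"dst"`, `"both"`; all of these names are assumptions made for the example and are not part of the formal development.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """A packet reduced to its source (sa) and destination (da) addresses."""
    sa: str
    da: str


def tr(p, dn, sn, kind):
    """Return the set of packets obtained by translating p with addresses from dn / sn."""
    if kind == "dst":    # rewrite the destination address only
        return {p._replace(da=a) for a in dn}
    if kind == "src":    # rewrite the source address only
        return {p._replace(sa=a) for a in sn}
    if kind == "both":   # rewrite both addresses, in every combination
        return {p._replace(da=d, sa=s) for d in dn for s in sn}
    raise ValueError(f"unknown translation kind: {kind}")


p = Packet(sa="10.0.0.1", da="203.0.113.7")
print(tr(p, {"192.168.0.8", "192.168.0.9"}, {"198.51.100.1"}, "dst"))
```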


Furthermore, we model the default policy of a ruleset R with the predicate dp, 
true when the policy is accept, false otherwise. 

Given an unfolded ruleset $R$, we build the predicate $P_R(p, \tilde{p})$ that holds when
the packet $p$ is accepted as $\tilde{p}$ by $R$. Its definition is in Table 3, by induction on
the rules in $R$. Intuitively, the empty ruleset applies the default policy $dp(R)$
and does not transform the packet, encoded by the constraint $p = \tilde{p}$. The rule
$(\phi, \mathrm{ACCEPT})$ considers two cases: when $\phi(p)$ holds, the packet is accepted as it is;
when instead $\neg\phi(p)$ holds, $p$ is accepted as $\tilde{p}$ only if the continuation $R$ accepts it.
The rule $(\phi, \mathrm{DROP})$ accepts $p$ only if the continuation does and $\phi(p)$ does not hold.
The rule $(\phi, \mathrm{NAT}(d_n, s_n))$ is like $(\phi, \mathrm{ACCEPT})$: the difference is that, when $\phi(p)$ holds,
it gives $\tilde{p}$ by applying to $p$ the NAT translations $tr(p, d_n, s_n, \leftrightarrow)$. Finally,
$(\phi, \mathrm{CHECK\text{-}STATE}(X))$ is like a NAT that applies all possible translations of kind $X$
(written as $tr(p, *{:}*, *{:}*, X)$). The idea is that, since we abstract away from the
actual established connections, we over-approximate the state by considering
any possible translations. At run-time, only the connections corresponding to
the actual state will be possible. The rule $(\phi, \mathrm{MARK}(m))$ is like a NAT, but when
$\phi(p)$ holds it requires that the continuation accepts $p$ tagged by $m$ as $\tilde{p}$.


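Example 3. The predicate of the unfolded ruleset in Example 2 when
$dp(C_B) = F$ is

\[
\begin{aligned}
P_{\llbracket C_B \rrbracket}(p, \tilde{p}) = {} & \neg\phi_1 \wedge ( \\
& (\phi_2 \wedge \phi_{11} \wedge p = \tilde{p}) \vee (\neg(\phi_2 \wedge \phi_{11}) \wedge ( \\
& (\phi_2 \wedge \phi_{12} \wedge \phi_{21} \wedge p = \tilde{p}) \vee (\neg(\phi_2 \wedge \phi_{12} \wedge \phi_{21}) \wedge ( \\
& \neg(\phi_2 \wedge \phi_{12} \wedge \neg\phi_{22} \wedge \phi_{23}) \wedge ( \\
& \neg(\phi_2 \wedge \phi_{13}) \wedge ( \\
& (\phi_3 \wedge p = \tilde{p}) \vee (\neg\phi_3 \wedge ( \\
& F \wedge p = \tilde{p})))))))))
\end{aligned}
\]

Note that if $\neg\phi_1 \wedge \phi_2 \wedge \phi_{11}$ holds then the formula trivially holds and therefore
the formula accepts the packet as the semantics does.

As a further example, consider the case in which $\phi_2, \phi_{12}, \phi_{22}, \phi_{23}, \phi_3$ hold for
a packet $p$, while all the other $\phi$'s do not. Then, $p$ is accepted as it is: the
rule $(\phi_{23}, \mathrm{DROP})$ is not evaluated since $\phi_{22}$ holds and the return is performed (cf.
Example 1). Indeed, the predicate $P_{\llbracket C_B \rrbracket}(p, p)$ evaluates to:

\[
T \wedge (F \vee (T \wedge (F \vee (T \wedge (T \wedge (T \wedge (T \vee (F \wedge F)))))))) = T
\]

Instead, if $\phi_{13}$ holds too, the packet is rejected as expected:

\[
T \wedge (F \vee (T \wedge (F \vee (T \wedge (T \wedge (F \wedge (T \vee (F \wedge F)))))))) = F
\]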
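
The two evaluations above can be replayed mechanically. The following Python sketch covers only the ACCEPT and DROP cases of Table 3 and fixes the output packet to be the input packet itself (so the constraint on the two packets being equal is always true). Representing each rule predicate as a tuple of literals over atoms named `p1`, `p11`, and so on, with `!` marking negation, is an assumption of the sketch.

```python
def holds(conjuncts, true_atoms):
    """A conjunction holds when each literal does; '!x' denotes the negation of atom x."""
    return all(
        (lit[1:] not in true_atoms) if lit.startswith("!") else (lit in true_atoms)
        for lit in conjuncts
    )


def accepted_unchanged(rules, default_accept, true_atoms):
    """Evaluate P_R(p, p) for an unfolded ruleset made of ACCEPT and DROP rules only."""
    if not rules:                          # empty ruleset: apply the default policy
        return default_accept
    (phi, target), rest = rules[0], rules[1:]
    h = holds(phi, true_atoms)
    cont = accepted_unchanged(rest, default_accept, true_atoms)
    if target == "ACCEPT":
        return h or (not h and cont)
    if target == "DROP":
        return (not h) and cont
    raise ValueError(f"unsupported target: {target}")


# The unfolded ruleset of Example 2, with default policy "drop".
UNFOLDED_CB = [
    (("p1",), "DROP"),
    (("p2", "p11"), "ACCEPT"),
    (("p2", "p12", "p21"), "ACCEPT"),
    (("p2", "p12", "!p22", "p23"), "DROP"),
    (("p2", "p13"), "DROP"),
    (("p3",), "ACCEPT"),
]

first_case = {"p2", "p12", "p22", "p23", "p3"}
print(accepted_unchanged(UNFOLDED_CB, False, first_case))            # first evaluation
print(accepted_unchanged(UNFOLDED_CB, False, first_case | {"p13"}))  # second evaluation
```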


The predicate in Table 3 is semantically correct, because if a packet $p$ is accepted
by a ruleset $R$ as $p'$, then $P_R(p, p')$ holds, and vice versa. Formally,


Lemma 1. Given a ruleset $R$ we have that


1. $\forall p, s.\ p, s \models_R (\mathrm{ACCEPT}, p') \implies P_R(p, p')$; and
2. $\forall p, p'.\ P_R(p, p') \implies \exists s.\ p, s \models_R (\mathrm{ACCEPT}, p')$


We eventually define the predicate associated with a whole firewall as follows. 


Definition 8. Let $F = (C, \rho, c)$ be a firewall with control diagram $C = (Q, A, q_i, q_f)$.
The predicate associated with $F$ is defined as

\[
\begin{aligned}
P_F(p, \tilde{p}) &= P^{\emptyset}_{q_i}(p, \tilde{p}) \quad \text{where} \\
P^{I}_{q_f}(p, \tilde{p}) &\triangleq p = \tilde{p} \\
P^{I}_{q}(p, \tilde{p}) &\triangleq \exists p'.\ P_{c(q)}(p, p') \wedge
  \Big( \bigvee_{\substack{(q, \psi, q') \in A \\ q' \notin I}} \psi(p') \wedge P^{I \cup \{q\}}_{q'}(p', \tilde{p}) \Big)
\end{aligned}
\]

for all $q \in Q$ such that $q \neq q_f$, and where $P_{c(q)}$ is the predicate constructed from
the ruleset associated with the node $q$ of the control diagram.

Intuitively, in the final node $q_f$ we accept $p$ as it is. In all the other nodes, $p$ is
accepted as $\tilde{p}$ if and only if there is a path in the control diagram, starting from $q$,
that obtains $\tilde{p}$ from $p$ through intermediate transformations. More precisely, we look for
an intermediate packet $p'$, provided that (i) $p$ is accepted as $p'$ by the ruleset
$c(q)$ of node $q$; (ii) $p'$ satisfies one of the predicates $\psi$ labeling the branches of
the control diagram; and (iii) $p'$ is accepted as $\tilde{p}$ in the reached node $q'$. Note
that we ignore paths with loops, because firewalls have mechanisms to detect
and discard a packet when its elaboration loops. To this aim, our predicate uses
the set $I$ for recording the nodes already traversed.

We conclude this section by establishing the correspondence between the logical
formulation and the operational semantics of a firewall. Formally, $F$ accepts
the packet $p$ as $\tilde{p}$ if the predicate $P_F(p, \tilde{p})$ is satisfied, and vice versa:


Theorem 2 (Correctness of the logical characterization). Given a firewall
$F = (C, \rho, c)$ and its corresponding predicate $P_F$ we have that


1. $s \xrightarrow{p,p'} s \uplus (p, p') \implies P_F(p, p')$


2. $\forall p, p'.\ P_F(p, p') \implies \exists s.\ s \xrightarrow{p,p'} s \uplus (p, p')$


Recall that the logical characterization abstracts away the notion of state, and
thus $P_F(p, p')$ holds if and only if there exists a state $s$ in which $p$ is accepted as $p'$.
In particular, if the predicate holds for a packet p that belongs to an established 
connection, p will be accepted only if the relevant state is reached at runtime. 
This is the usual interpretation of firewall rules for established connections. 


7 Policy Generation 


The declarative specification extracted from a firewall policy (cf. Table 1) can be
mapped to a firewall $F_S$ whose control diagram has just one node. The ruleset $R_S$
associated with this node only contains ACCEPT and NAT rules, each corresponding
to a line of the declarative specification. In Sect. 3 we showed that each line is
disjoint from the others. Hence, the ordering of rules in $R_S$ is irrelevant.

Here we compile $F_S$ into an equivalent firewall $F_C$. First, we introduce an
algorithm that computes the basic rulesets of $F_C$. Then, we map these rulesets
to the nodes of the control diagram of a real system. Finally, we prove the
correctness of the compilation.

For simplicity, we produce a firewall that automatically accepts all the packets 
that belong to established connections with the appropriate translations. We 
claim this is not a limitation, since it is the default behavior of some real firewall 
systems (e.g., pf) and it is quite odd to drop packets, once the initial connection 
has been established. Moreover, this is consistent with the over-approximation 
on the firewall state done in Sect. 6.2.

Algorithm 1. Generation of the rulesets $R_{dnat}$, $R_{fil}$, $R_{snat}$, $R_{mark}$ from $R_S$
 1: $R_{dnat} = R_{fil} = R_{snat} = R_{mark} = \epsilon$
 2: for $r$ in $R_S$ do
 3:     if $r = (\phi, \mathrm{ACCEPT})$ then
 4:         add $r$ to $R_{fil}$
 5:     else if $r = (\phi, \mathrm{NAT}(d_n, s_n))$ then
 6:         generate fresh tag $m$
 7:         add $(\phi \wedge tag(p) = \bullet, \mathrm{MARK}(m))$ to $R_{mark}$
 8:         add $(tag(p) = m, \mathrm{NAT}(d_n, \star))$ to $R_{dnat}$
 9:         add $(tag(p) = m, \mathrm{NAT}(\star, s_n))$ to $R_{snat}$
10:     end if
11: end for
12: add $(tag(p) \neq \bullet, \mathrm{ACCEPT})$ and $(true, \mathrm{DROP})$ to $R_{fil}$
13: prepend $R_{mark}$ to $R_{dnat}$, $R_{fil}$ and $R_{snat}$


7.1 Compiling a Firewall Specification 


Our algorithm takes as input the ruleset $R_S$ derived from a synthesized specification
and yields the rulesets $R_{fil}$, $R_{dnat}$, $R_{snat}$ (with default ACCEPT policy)
containing filtering, DNAT and SNAT rules. This separation reflects that all the
real systems we have analyzed impose constraints on where NAT rules can be
placed, e.g., in iptables, DNAT is allowed only in rulesets $R^{nat}_{Pre}$ and $R^{nat}_{Out}$, while
SNAT only in $R^{nat}_{Inp}$ and $R^{nat}_{Post}$.

Intuitively, Algorithm 1 produces rules that assign different tags to packets
that must be processed by different NAT rules (lines 6 and 7). Each NAT rule is
split in a DNAT (line 8) and an SNAT (line 9), where the predicate $\phi$ becomes a
check on the tag of the packet. Filtering rules are left unchanged (line 4). Packets
subject to NAT are accepted in $R_{fil}$, while the others are dropped (line 12). We
prepend $R_{mark}$ to all rulesets, making sure that packets are always marked,
independently of which ruleset will be processed first (line 13). We use $\bullet$ to
denote the empty tag used when a packet has never been tagged.
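
The steps of Algorithm 1 can be sketched in Python as follows. The encoding is an assumption made for illustration: a rule is a pair (predicate, target), predicates are opaque values combined with small tagged tuples such as `("tag_eq", m)`, the empty tag is `None`, the wildcard is the string `"*"`, and fresh tags are consecutive integers.

```python
from itertools import count

EMPTY_TAG = None  # stands for the empty tag of a packet that was never tagged


def generate_rulesets(r_s):
    """Split the ACCEPT/NAT rules of r_s into filtering, DNAT and SNAT rulesets."""
    r_fil, r_dnat, r_snat, r_mark = [], [], [], []
    fresh = count(1)                                  # generator of fresh tags
    for phi, target in r_s:
        if target == "ACCEPT":                        # line 4: keep filtering rules
            r_fil.append((phi, "ACCEPT"))
        else:                                         # target = ("NAT", dn, sn)
            _, dn, sn = target
            m = next(fresh)                           # line 6
            r_mark.append((("and", phi, ("tag_eq", EMPTY_TAG)), ("MARK", m)))  # line 7
            r_dnat.append((("tag_eq", m), ("NAT", dn, "*")))                   # line 8
            r_snat.append((("tag_eq", m), ("NAT", "*", sn)))                   # line 9
    r_fil.append((("tag_neq", EMPTY_TAG), "ACCEPT"))  # line 12
    r_fil.append(("true", "DROP"))
    # line 13: marking rules come first in every ruleset
    return {"dnat": r_mark + r_dnat, "fil": r_mark + r_fil, "snat": r_mark + r_snat}


out = generate_rulesets([("phi1", "ACCEPT"), ("phi2", ("NAT", "dn", "sn"))])
for name, rules in out.items():
    print(name, rules)
```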

Recall that the $@$ operator combines rulesets in sequence. Note that $R_{fil}$
drops by default and shadows any ruleset appended to it. In practice, the only
interesting rulesets are $\mathcal{R} = \{R_\epsilon, R_{fil}, R_{dnat}, R_{snat}, R_{dnat} @ R_{fil}, R_{snat} @ R_{fil}\}$
where $R_\epsilon$ is the empty ruleset with default ACCEPT policy. Since here we do not
discuss ipfw [18] and other firewalls with a minimal control diagram, we use
neither $R_{dnat} @ R_{fil}$ nor $R_{snat} @ R_{fil}$.

We now introduce the notion of compiled firewall. 


Definition 9 (Compiled firewall). A firewall $F_C = (C, \rho, c)$ with control diagram
$C = (Q, A, q_i, q_f)$ is a compiled firewall if


- $c(q_i) = c(q_f) = R_\epsilon$
- $c(q) \in \mathcal{R}$ for every $q \in Q \setminus \{q_i, q_f\}$
- every path $\pi$ from $q_i$ to $q_f$ in the control diagram $C$ traverses a node $q$ such
  that $c(q) \in \{R_{fil}, R_{dnat} @ R_{fil}, R_{snat} @ R_{fil}\}$

Intuitively, the above definition requires that only rulesets in $\mathcal{R}$ are associated
with the nodes in the control diagram and that all paths pass at least once through
a node with the filtering ruleset.
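
The three conditions of the definition can be checked mechanically on a small diagram. The Python sketch below is a simplification under stated assumptions: arc labels are ignored, only loop-free paths are enumerated, a ruleset is encoded as a tuple of basic names (so `("dnat", "fil")` stands for the DNAT ruleset followed by the filtering one), and the two-branch diagram `ARCS` is a hypothetical stand-in, not the actual control diagram of any real system.

```python
EPS = ("eps",)
ALLOWED = {EPS, ("fil",), ("dnat",), ("snat",), ("dnat", "fil"), ("snat", "fil")}
FILTERING = {("fil",), ("dnat", "fil"), ("snat", "fil")}


def simple_paths(arcs, q, qf, seen):
    """Enumerate the loop-free paths from q to qf."""
    if q == qf:
        yield [q]
        return
    for nxt in arcs.get(q, ()):
        if nxt not in seen:
            for rest in simple_paths(arcs, nxt, qf, seen | {nxt}):
                yield [q] + rest


def is_compiled(arcs, c, qi, qf):
    """Check the three conditions of a compiled firewall on diagram `arcs` with mapping `c`."""
    if c[qi] != EPS or c[qf] != EPS:
        return False
    if any(c[q] not in ALLOWED for q in c if q not in (qi, qf)):
        return False
    return all(
        any(c[q] in FILTERING for q in path)
        for path in simple_paths(arcs, qi, qf, frozenset({qi}))
    )


# Hypothetical diagram with an inbound and an outbound branch.
ARCS = {"qi": ["Inp_n", "Out_n"], "Inp_n": ["Inp_f"], "Inp_f": ["qf"],
        "Out_n": ["Out_f"], "Out_f": ["qf"]}
MAPPING = {"qi": EPS, "qf": EPS, "Inp_n": ("dnat",), "Inp_f": ("fil",),
           "Out_n": ("snat",), "Out_f": ("fil",)}

print(is_compiled(ARCS, MAPPING, "qi", "qf"))
```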


Example 4. Now we map the rulesets to the nodes of the control diagrams of
the real systems presented in Sect. 4.1. For iptables we have:

\[
\begin{aligned}
& c(Pre^n) = R_{dnat} \quad c(Out^n) = R_{dnat} \quad c(Inp^n) = R_{snat} \quad c(Post^n) = R_{snat} \\
& c(Fwd^f) = R_{fil} \quad c(Inp^f) = R_{fil} \quad c(Out^f) = R_{fil}
\end{aligned}
\]

while the remaining nodes get the empty ruleset $R_\epsilon$. For pf we have:

\[
c(Inp^n) = R_{dnat} \quad c(Out^n) = R_{snat} \quad c(Inp^f) = R_{fil} \quad c(Out^f) = R_{fil}
\]


7.2 Correctness of the Compiled Firewall 


We start by showing that a compiled firewall $F_C$ accepts the same packets as
$F_S$, possibly with a different translation.


Lemma 2. Let $F_C$ be a compiled firewall. Given a packet $p$, we have that

\[
\exists p'.\ P_{F_S}(p, p') \iff \exists p''.\ P_{F_C}(p, p'').
\]


Let $\mathcal{T} = \{id, dnat, snat, nat\}$ be the set of translations possibly applied to a
packet while it traverses a firewall. The first, $id$, represents the identity, $dnat$
and $snat$ are for DNAT and SNAT, while $nat$ represents both DNAT and SNAT.
Also, let $(\mathcal{T}, <)$ be the partial order such that $id < dnat$, $id < snat$, $dnat < nat$
and $snat < nat$. Finally, given a packet $p$ and a firewall $F$, let $\pi_F(p)$ be the path
in the control diagram of $F$ along which $p$ is processed. Note that there exists a
unique path for each packet because the control diagram is deterministic.

The following function computes the translation capability of a path $\pi$, i.e.,
which translations can be performed on packets processed along $\pi$.


Definition 10 (Translation capability). Let $\pi = (q_1, \ldots, q_n)$ be a path on the
control diagram of a compiled firewall $F = (C, \rho, c)$. The translation capability
of $\pi$ is

\[
tc(\pi) = lub\Big( \bigcup_{q_j \in \pi} \gamma(c(q_j)) \Big)
\]

where $lub$ is the least upper bound of a set $T \subseteq \mathcal{T}$ w.r.t. $<$ and $\gamma$ is defined as

\[
\begin{aligned}
\gamma(R) &= \{id\} && \text{for } R \in \{R_\epsilon, R_{fil}\} \\
\gamma(R_t) &= \{t\} && \text{for } t \in \{dnat, snat\} \\
\gamma(R_1 @ R_2) &= \gamma(R_1) \cup \gamma(R_2)
\end{aligned}
\]

We write $p \sim p'$ to denote that $p' = p[tag \mapsto m]$ for some marking $m$. In addition,
let $t_\theta$ be a function that, given a packet $p$ and its translation $p'$, computes a
packet $p''$ where only the translation $\theta \in \mathcal{T}$ is applied to $p$, defined as:

\[
\begin{aligned}
t_{id}(p, p') &= p & t_{dnat}(p, p') &= p[da \mapsto da(p')] \\
t_{nat}(p, p') &= p' & t_{snat}(p, p') &= p[sa \mapsto sa(p')]
\end{aligned}
\]
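
A direct transcription of the four cases may help to fix ideas. In this Python sketch a packet is reduced to its source and destination addresses, and the translation is selected by a string; both choices, as well as the names `Packet` and `t`, are assumptions made for the example.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """A packet reduced to its source (sa) and destination (da) addresses."""
    sa: str
    da: str


def t(theta, p, p_prime):
    """Apply to p only the part of the translation p -> p_prime selected by theta."""
    if theta == "id":      # no translation at all
        return p
    if theta == "dnat":    # take only the destination address of p_prime
        return p._replace(da=p_prime.da)
    if theta == "snat":    # take only the source address of p_prime
        return p._replace(sa=p_prime.sa)
    if theta == "nat":     # take the whole translation
        return p_prime
    raise ValueError(f"unknown translation: {theta}")


p = Packet(sa="10.0.0.1", da="203.0.113.7")
p_prime = Packet(sa="198.51.100.1", da="192.168.0.8")
print(t("dnat", p, p_prime))
```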


The following theorem describes the relationship between a compiled firewall $F_C$
and the firewall $F_S$. Intuitively, $F_S$ accepts a packet $p$ as $p'$ if and only if $F_C$
accepts a packet $p$ as $p''$ where $p'$ and $p''$ only differ on marking and NAT. More
specifically, $p''$ is derived from $p$ by applying all the translations available on the
path $\pi_{F_C}(p)$ in the control diagram of $F_C$, along which $p$ is processed.


Theorem 3. Let $p, p'$ be two packets such that $p$ is accepted by both $F_S$ and $F_C$.
Moreover, let $p'' \sim t_\theta(p, p')$ where $\theta = tc(\pi_{F_C}(p))$. We have that

\[
P_{F_S}(p, p') \iff P_{F_C}(p, p'').
\]


Example 5. Consider again Example 4. Any path $\pi$ in iptables has $tc(\pi) = nat$,
which implies $p' \sim p''$, i.e., $F_C$ behaves exactly as $F_S$. Interestingly, paths
$\pi_1 = (q_i, Inp^n, Inp^f, q_f)$ and $\pi_2 = (q_i, Out^n, Out^f, q_f)$ in pf have $tc(\pi)$ equal to
$dnat$ and $snat$, respectively. In fact, pf cannot perform $snat$ and $dnat$ on packets
directed to and generated from the host, respectively.
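
The translation capability of the two pf paths can be recomputed with a short Python sketch. The encoding is an assumption: rulesets are tuples of basic names, node names such as `Inp_n` are plain strings, and the least upper bound is hard-coded for the four-element order in which id is below dnat and snat, which are both below nat.

```python
GAMMA = {"eps": {"id"}, "fil": {"id"}, "dnat": {"dnat"}, "snat": {"snat"}}


def gamma(ruleset):
    """Translations offered by a ruleset given as a tuple of basic ruleset names."""
    return set().union(*(GAMMA[name] for name in ruleset))


def lub(translations):
    """Least upper bound in the order id < dnat, id < snat, dnat < nat, snat < nat."""
    if "nat" in translations or {"dnat", "snat"} <= translations:
        return "nat"
    if "dnat" in translations:
        return "dnat"
    if "snat" in translations:
        return "snat"
    return "id"


def tc(path, c):
    """Translation capability of a path, given the node-to-ruleset mapping c."""
    return lub(set().union(*(gamma(c[q]) for q in path)))


# Mapping for pf as in Example 4; the initial and final nodes carry the empty ruleset.
PF = {"qi": ("eps",), "qf": ("eps",), "Inp_n": ("dnat",), "Inp_f": ("fil",),
      "Out_n": ("snat",), "Out_f": ("fil",)}

print(tc(["qi", "Inp_n", "Inp_f", "qf"], PF))
print(tc(["qi", "Out_n", "Out_f", "qf"], PF))
```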


8 Conclusions 


We have proposed a transcompiling pipeline for firewall languages, made of three
stages. Its core is IFCL, an intermediate language equipped here with a formal 
semantics. It has the typical actions of real configuration languages, and it keeps 
them apart from the way the firewall applies them, represented by a control 
diagram. In stage 1, a real firewall policy language can be encoded in IFCL by 
simply instantiating the state and the control diagram. As a by-product, we give 
a formal semantics to the source language, which usually has none. In stage 2, 
we have built a logical predicate that describes the flow of packets accepted by 
the firewall together with their possible translations. From that, we have synthe- 
sized a declarative firewall specification, in the form of a table that succinctly 
represents the firewall behavior. This table is the basis for supporting policy 
analysis, like policy implication and comparison, as described in our companion 
paper [1]. The declarative specification is the input of stage 3, which compiles 
it to a real target language. To illustrate, we have applied these stages on two 
among the most used firewall systems in Linux and Unix: iptables and pf. 
We have selected these two systems because they exhibit very different packet 
processing schemes, making the porting of configurations very challenging. All 
the stages above have been proved to preserve the semantics of the original policy,
so guaranteeing that our transcompilation is correct. As a matter of fact,
we have proposed a way to mechanically implement policy refactoring, when
the source and the target languages coincide. This is because the declarative
specification has no anomalies, e.g., rule overlapping or shadowing, so helping 
the system administrator also in policy maintenance. At the same time, we have 
put forward a manner to mechanically port policies from one firewall system 
to another, when their languages differ. We point out that, even though [1] 
intuitively presents and implements the first two stages of our transcompiling 
pipeline, the overlap with this paper is only on Sects. 4 and 6.2. Indeed, the the- 
ory, the semantics, the compilation of stage 3 and the proofs of the correctness 
of the whole transcompilation are original material. 

As future work, we intend to further experiment on our proposal by encoding
more languages, e.g., from specialized firewall devices, like commercial Cisco
IOS, or within the SDN paradigm. We plan to include a (more refined) policy 
generator of stage 3 in the existing tool [1] that implements the stages 1 and 2, 
and can deal with configurations made of hundreds of rules. Also testing and 
improving the performance of our transcompiler, as well as providing it with a 
friendly interface would make it more appealing to network administrators. For 
example, readability can be improved by automatically grouping rules and by 
adding comments that explain the meaning of refactored configurations. Finally, 
it would be very interesting to extend our approach to deal with networks with 
more than one firewall. The idea would be to combine the synthesized specifications
based on network topology and routing.


Abstract. Ensuring security of complex systems is a difficult task that
requires utilization of numerous tools originating from various domains. 
Among those tools we find attack-defense trees, a simple yet practical 
model for analysis of scenarios involving two competing parties. Enhanc- 
ing the well-established model of attack trees, attack-defense trees are 
trees with labeled nodes, offering an intuitive representation of possible 
ways in which an attacker can harm a system, and means of countering 
the attacks that are available to the defender. The growing palette of 
methods for quantitative analysis of attack-defense trees provides secu- 
rity experts with tools for determining the most threatening attacks 
and the best ways of securing the system against those attacks. Unfor- 
tunately, many of those methods might fail or provide the user with 
distorted results if the underlying attack-defense tree contains multiple 
nodes bearing the same label. We address this issue by studying condi- 
tions ensuring that the standard bottom-up evaluation method for quan- 
tifying attack-defense trees yields meaningful results in the presence of 
repeated labels. For the case when those conditions are not satisfied, we 
devise an alternative approach for quantification of attacks. 


1 Introduction 


Beginning with 19th century chemistry and the groundbreaking work of Cayley,
who used them for the purposes of enumeration of isomers, trees (connected
acyclic graphs) have a long history of application to various domains. Those
include safety analysis of systems using the model of fault trees [10], developed
in the 1960s, and security analysis with the assistance of attack trees, which
the fault trees inspired. Attack trees were introduced by Schneier in [26], for 
the purpose of analyzing security of systems and organizations. Seemingly sim- 
ple, attack trees offer a compromise between expressiveness and usability, which 
not only makes them applicable for industrial purposes [23], but also puts them 
at the core of many more complex models and languages [11,24]. An exten- 
sive overview and comparison of attack tree-based graphical models for security 
can be found in [20]. A survey focusing on scalability, complexity analysis and 
practical usability of such models has recently been provided in [12].

Attack-defense trees [18] are one of the most well-studied extensions of attack
trees, with new methods of their analysis developed yearly [2,3,8,21]. Attack- 
defense trees enhance attack trees with nodes labeled with goals of a defender, 
thus enabling modeling of interactions between the two competing actors. They 
have been used to evaluate the security of real-life systems, such as ATMs [7], 
RFID managed warehouses [4] and cyber-physical systems [16]. Both the theoretical
developments and the practical studies have proven that attack-defense
trees offer a promising methodology for security evaluation, but they also highlighted
room for improvement. The objective of the current paper is to address
the problem of quantitative analysis of attack-defense trees with repeated labels.


Related Work. It is well known that the analysis of an attack-defense tree becomes
more difficult if the tree contains repeated labels. This difficulty is sometimes 
recognized, e.g., in [2,21], where authors explicitly assume lack of repeated labels 
in order for their methods to be valid. In some works the problem is avoided (or 
overlooked) by interpretation of repeated labels as distinct instances of the same 
goal, thus, de facto as distinct goals (e.g., [8,13,18,22]), or by distinguishing 
between the repetitions occurring in specific subtrees of a tree, as in [3]. Recently, 
Bossuat and Kordy have established a classification of repeated labels in attack— 
defense trees, depending on whether the corresponding nodes represent exactly 
the same instance or different instances of a goal [5]. They point out that, if the 
meaning of repeated labels is not properly specified, then the fast, bottom-up 
method for identifying attacks that optimize an attribute (e.g., minimal cost, 
probability of success, etc.), as used in [15,18,22], might yield tainted results. 

Repeated labels are also problematic in other tree-based models, for instance 
fault trees. Whereas some methods for qualitative analysis of fault trees with 
repeated basic events (or generally, shared subtrees) have been developed [6, 
27], their quantification might rely on approximate methods. For example, the 
probability of a system failure can be evaluated using rare event approximation 
approach (see [10], Chap. XI), while a simple bottom-up procedure gives an exact 
result in fault trees with no shared subtrees [1]. This last observation is consistent 
with the results previously obtained for attack—defense trees (see Theorems 2—4 
in [2]). 


Contribution. The contribution of this work is threefold. First, we determine 
sufficient conditions ensuring that the standard quantitative bottom-up analysis 
of attack—defense trees with repeated labels is valid. Second, we prove that some 
of these conditions are in fact necessary for the analysis to be compatible with 
a selected semantics for attack—defense trees. Finally, for the case when these 
conditions are not satisfied, we propose a novel, alternative method of evaluation 
of attributes that takes the presence of repeated labels into account. 


Paper Structure. The model of attack-defense trees is introduced in detail in the
next section. In Sect. 3, the attributes and existing methods for their evaluation
are explained. In Sect. 4, we present our main results on quantification of
attack-defense trees with repeated labels. We give proofs of these results in Sect. 5, and
conclude in Sect. 6.


2 Attack-Defense Trees


Attack—defense trees are rooted trees with labeled nodes that allow for an intu- 
itive graphical representation of scenarios involving two competing actors, usu- 
ally called attacker and defender. Nodes of a tree are labeled with goals of the 
actors, with the label of the root of the tree being the main goal of the modeled 
scenario. The actor whose goal is represented by the root is called proponent and 
the other one is called opponent. The aim of the proponent is to achieve the root 
goal, whereas the opponent tries to make this impossible. 

In order for an actor to achieve some particular goal g, they might need to 
achieve other goals. In such a case the node labeled with g is a refined node. The 
basic model of attack—defense trees (as introduced in [18]) admits two types of 
refinements: the goal of a conjunctively refined node (an AND node) is achieved 
if the goals of all its child nodes are achieved, and the goal of a disjunctively 
refined node (an OR node) is achieved if at least one of the goals of its children 
is achieved. If a node is not refined, then it represents a goal that is considered 
to be directly achievable, for instance by executing a simple action. Such a goal 
is called a basic action. Hence, in order to achieve goals of refined nodes, the 
actors execute (some of) their basic actions. What distinguishes attack—defense 
trees model from attack trees is the possibility of the goals of the actors to be 
countered by goals of their adversary, which themselves can be again countered, 
and so on. To represent the countering of a goal, the symbol C will be used. A 
goal g is countered by a goal g’ (denoted C(g,g’)) if achieving g’ by one of the 
actors makes achieving g impossible for the other actor. 

It is not rare that in an attack-defense tree, whether generated by hand or
in a semi-automatic way [14,25,28], some nodes bear the same label. In such a
case, there are two ways of interpreting them:


1. either the nodes represent the same single instance of the goal, e.g., cutting
the power off in a building can be done once and has multiple consequences,
thus a number of refined nodes might have a node labeled cutPowerOff among
their child nodes, but all these nodes will represent exactly the same action
of cutting the power off;

2. or else each of the nodes is treated as a distinct instance of the goal. For
instance, while performing an attack, the attacker might need to pass through
a door twice: once to enter and a second time to leave a building. Since these
actions refer to the same door and the same attacker, the corresponding nodes
will, in most cases, hold the same label goThroughDoor. However, it is clear
that they represent two different instances of the same goal.


In this work we assume the first of these ways of interpretation. In particular, 
following [5], we call a basic action that serves as a label for at least two nodes 
a clone or a cloned basic action, and interpret them as the same instance of a 
goal. Nodes representing distinct instances of the same goal or distinct goals are 
assumed in this work to have different labels.

An example of an attack-defense tree¹ is represented in Fig. 1. In this tree, the
proponent is the attacker and the opponent is the defender. According to the
attack-defense tree convention, nodes representing goals of the attacker are
depicted using red circles, and those of the defender using green rectangles.
Children of an AND node are joined by an arc, and countermeasures are attached
to the nodes they are supposed to counter via dotted edges.


Example 1. In the attack-defense scenario represented by the attack-defense
tree from Fig. 1, the proponent wants to steal money from the opponent’s
account. To achieve this goal, they can use physical means, i.e., force the opponent
to reveal their PIN, steal the opponent’s card and then withdraw money
from an ATM. One way of learning the PIN would be to eavesdrop on the victim
when they enter the PIN. This could be prevented by covering the keypad with
a hand. Covering the keypad fails if the proponent monitors the keypad with a
hidden micro-camera installed at an appropriate spot. Another way of getting
the PIN would be to force the opponent to reveal it.

Instead of attacking from a physical angle, the proponent can steal money by 
exploiting online banking services. In order to do so, they need to learn the oppo- 
nent’s user name and password. Both of these goals can be achieved by creating 
a fake bank website and using phishing techniques for tricking the opponent into 
entering their credentials. The proponent could also try to guess what the password
and the user name are. Using a very strong password would counter such a
guessing attack. Once the proponent obtains the credentials, they use them for
logging into the online banking services and execute a transfer. Transfer disposi- 
tions might be additionally secured with two-factor authentication using mobile 
phone text messages. This security measure could be countered by the proponent 
by stealing the opponent’s phone. 


Note that even though there are two nodes labeled with phishing in the tree, 
they actually represent the same instance of the same action. The proponent does 
not need to perform two different phishing attacks to get the password and the 
user name—setting up one phishing website and sending one phishing e-mail will 
suffice for the proponent to get both credentials. Thus, the two nodes labeled 
phishing are clones. 
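
Detecting clones amounts to counting how often each basic action labels a leaf. The Python sketch below assumes a tree given as nested tuples whose first component is the refinement type and whose leaves are label strings; the small tree used here is a hypothetical fragment inspired by the online banking branch, not a transcription of Fig. 1.

```python
from collections import Counter


def basic_actions(tree):
    """Yield the label of every leaf, once per occurrence."""
    if isinstance(tree, str):
        yield tree
    else:
        for child in tree[1:]:   # tree[0] is the refinement type
            yield from basic_actions(child)


def clones(tree):
    """Labels of basic actions that occur on at least two nodes."""
    counts = Counter(basic_actions(tree))
    return {label for label, n in counts.items() if n >= 2}


# Hypothetical fragment: both credentials can be obtained by the same phishing action.
CREDENTIALS = ("AND",
               ("OR", "phishing", "guessPassword"),
               ("OR", "phishing", "guessUserName"))

print(clones(CREDENTIALS))
```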

Let us now introduce a formal notation for attack—defense trees, which we 
will use throughout this paper. Such notation is necessary to formally define the 
meaning of attack—defense trees in terms of formal semantics and to specify the 
algorithms for their quantitative analysis. 

We use symbols $p$ and $o$ to distinguish between the proponent and the opponent.
By $\mathbb{B}^p$ and $\mathbb{B}^o$ we denote the sets of labels representing basic actions of
the proponent and of the opponent, respectively. We assume that $\mathbb{B}^p \cap \mathbb{B}^o = \emptyset$,
and we set $\mathbb{B} = \mathbb{B}^p \cup \mathbb{B}^o$. For $s \in \{p, o\}$, the symbol $\bar{s}$ stands for the other actor,
i.e., $\bar{p} = o$ and $\bar{o} = p$. We denote the elements of $\mathbb{B}^s$ with $b^s$, for $s \in \{p, o\}$.
Attack-defense trees can be seen as terms generated by the following grammar,
where $\mathrm{OR}^s$ and $\mathrm{AND}^s$ are unranked refinement operators, i.e., they may take an
arbitrary number of arguments, and $\mathrm{C}^s$ is a binary counter operator.

\[
T^s \colon\ b^s \mid \mathrm{OR}^s(T^s, \ldots, T^s) \mid \mathrm{AND}^s(T^s, \ldots, T^s) \mid \mathrm{C}^s(T^s, T^{\bar{s}}) \tag{1}
\]

¹ The example is based on one of the example trees provided by ADTool [9].


Example 2. Consider the tree from Fig. 1. The term corresponding to the subtree
rooted in the via ATM node is

\[
\mathrm{AND}^p\big(\mathrm{OR}^p\big(\mathrm{C}^p(\mathit{eavesdrop}, \mathrm{C}^o(\mathit{coverKey}, \mathit{camera})), \mathit{force}\big), \mathit{stealCard}, \mathit{withdrawCash}\big),
\]

where the labels of basic actions have been shortened for better readability.


We denote the set of trees generated by grammar (1) with $\mathbb{T}$.

In order to analyze possible attacks in an attack—defense tree, in particu- 
lar, determine cheapest ones, or the ones that require the least amount of time 
to execute, one needs to decide what is considered to be an attack. This can 
be achieved with the help of semantics that provide formal interpretations for 
attack—defense trees. Several semantics for attack—defense trees have been pro- 
posed in [18]. Below, we recall two ways of interpreting attack—defense trees and 
the notions of attack they entail. 


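Definition 1. The propositional semantics for attack-defense trees is a function
$\mathcal{P}$ that assigns to each attack-defense tree a propositional formula, in a recursive
way, as follows

\[
\begin{aligned}
\mathcal{P}(b) &= x_b, &
\mathcal{P}(\mathrm{OR}^s(T^s_1, \ldots, T^s_k)) &= \mathcal{P}(T^s_1) \vee \cdots \vee \mathcal{P}(T^s_k), \\
\mathcal{P}(\mathrm{C}^s(T^s_1, T^{\bar{s}}_2)) &= \mathcal{P}(T^s_1) \wedge \neg \mathcal{P}(T^{\bar{s}}_2), &
\mathcal{P}(\mathrm{AND}^s(T^s_1, \ldots, T^s_k)) &= \mathcal{P}(T^s_1) \wedge \cdots \wedge \mathcal{P}(T^s_k),
\end{aligned}
\]

where $b \in \mathbb{B}$, and $x_b$ is the corresponding propositional variable. Two
attack-defense trees are equivalent wrt $\mathcal{P}$ if their interpretations are equivalent
propositional formulas.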
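
The propositional semantics can be evaluated directly on a term. In the Python sketch below, a tree is a nested tuple tagged with `"basic"`, `"or"`, `"and"` or `"counter"`, and an assignment is given as the set of basic actions whose variable is true; this encoding and the helper `b` are assumptions of the sketch. The term is the one of Example 2.

```python
def evaluate(tree, done):
    """Truth value of the propositional formula of `tree` when exactly the actions in `done` are true."""
    kind = tree[0]
    if kind == "basic":
        return tree[1] in done
    if kind == "or":
        return any(evaluate(child, done) for child in tree[1:])
    if kind == "and":
        return all(evaluate(child, done) for child in tree[1:])
    if kind == "counter":   # countered goal: achieved and the countermeasure is not
        return evaluate(tree[1], done) and not evaluate(tree[2], done)
    raise ValueError(f"unknown node type: {kind}")


def b(label):
    """Shorthand for a basic action."""
    return ("basic", label)


# The subtree rooted in the "via ATM" node (Example 2).
VIA_ATM = ("and",
           ("or",
            ("counter", b("eavesdrop"), ("counter", b("coverKey"), b("camera"))),
            b("force")),
           b("stealCard"),
           b("withdrawCash"))

attack = {"eavesdrop", "stealCard", "withdrawCash"}
print(evaluate(VIA_ATM, attack))                           # eavesdropping is not countered
print(evaluate(VIA_ATM, attack | {"coverKey"}))            # the keypad is covered
print(evaluate(VIA_ATM, attack | {"coverKey", "camera"}))  # the camera defeats the cover
```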


Definition 1 formalizes one of the most intuitive and widely used ways of 
interpreting attack—defense trees, where every basic action is assigned a propo- 
sitional variable indicating whether or not the action is satisfiable. In the light 
of the propositional semantics, an attack in an attack—defense tree T is any 
assignment of values to the propositional variables, such that the formula P(T) 
evaluates to true. We note that this natural approach is often used without invok- 
ing the propositional semantics explicitly (e.g., in [2] or [8]). Observe also that 
due to the idempotency of the logical operators $\lor$ and $\land$, and the fact that every 
basic action is assigned a single variable, when the propositional semantics is 
used, cloned actions are indeed treated as the same instance of the same action. 
In particular, this implies that the trees $\mathtt{AND}^p(b, \mathtt{OR}^p(b, b'))$ and $b$ are equivalent 
under the propositional interpretation. Such an approach might not always be 
desirable, especially when we do not only want to know whether attacks are 
possible, but actually how they can be achieved. To accommodate this point of 
view, the set semantics has recently been introduced in [5]. We briefly recall its 
construction below. 
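The equivalence of $\mathtt{AND}^p(b, \mathtt{OR}^p(b, b'))$ and $b$ under the propositional semantics can be checked mechanically with a truth table. The sketch below assumes a tuple encoding of terms, `("b", s, label)`, `("OR", s, [...])`, `("AND", s, [...])`, `("C", s, T1, T2)`, which is an illustrative choice and not part of the paper.

```python
from itertools import product

def holds(t, val):
    """Truth value of P(t) under the valuation val (label -> bool)."""
    kind = t[0]
    if kind == "b":
        return val[t[2]]
    if kind == "OR":
        return any(holds(c, val) for c in t[2])
    if kind == "AND":
        return all(holds(c, val) for c in t[2])
    # kind == "C": P(T1) and not P(T2), as in Definition 1.
    return holds(t[2], val) and not holds(t[3], val)

def labels(t):
    """Set of labels of the basic actions occurring in t."""
    if t[0] == "b":
        return {t[2]}
    children = t[2] if t[0] in ("OR", "AND") else [t[2], t[3]]
    return set().union(*(labels(c) for c in children))

def equivalent(t1, t2):
    """Brute-force check that P(t1) and P(t2) agree on every valuation."""
    names = sorted(labels(t1) | labels(t2))
    for bits in product([False, True], repeat=len(names)):
        val = dict(zip(names, bits))
        if holds(t1, val) != holds(t2, val):
            return False
    return True

b1 = ("b", "p", "b")
b2 = ("b", "p", "b'")
absorbed = ("AND", "p", [b1, ("OR", "p", [b1, b2])])
print(equivalent(absorbed, b1))
```

The brute-force check is exponential in the number of distinct labels, so it is only meant for small illustrative trees.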
In the sequel, we set 

\[ S \odot Z = \{(P_S \cup P_Z, O_S \cup O_Z) \mid (P_S, O_S) \in S,\ (P_Z, O_Z) \in Z\}, \tag{2} \]

for $S, Z \subseteq \wp(\mathbb{B}^p) \times \wp(\mathbb{B}^o)$, where for a set $X$ we denote its power set with $\wp(X)$. 


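Definition 2. The set semantics for attack–defense trees is a function 
$\mathcal{S}\colon \mathbb{T} \to \wp(\wp(\mathbb{B}^p) \times \wp(\mathbb{B}^o))$ that assigns to each attack–defense tree a set of pairs of sets 
of labels, as follows 

\[
\begin{aligned}
\mathcal{S}(b^p) &= \{(\{b^p\}, \emptyset)\}, & \mathcal{S}(b^o) &= \{(\emptyset, \{b^o\})\}, \\
\mathcal{S}(\mathtt{OR}^p(T_1^p, \dots, T_k^p)) &= \bigcup_{i=1}^{k} \mathcal{S}(T_i^p), & \mathcal{S}(\mathtt{OR}^o(T_1^o, \dots, T_k^o)) &= \bigodot_{i=1}^{k} \mathcal{S}(T_i^o), \\
\mathcal{S}(\mathtt{AND}^p(T_1^p, \dots, T_k^p)) &= \bigodot_{i=1}^{k} \mathcal{S}(T_i^p), & \mathcal{S}(\mathtt{AND}^o(T_1^o, \dots, T_k^o)) &= \bigcup_{i=1}^{k} \mathcal{S}(T_i^o), \\
\mathcal{S}(\mathtt{C}^p(T_1^p, T_2^o)) &= \mathcal{S}(T_1^p) \odot \mathcal{S}(T_2^o), & \mathcal{S}(\mathtt{C}^o(T_1^o, T_2^p)) &= \mathcal{S}(T_1^o) \cup \mathcal{S}(T_2^p).
\end{aligned}
\]

Two trees $T_1$ and $T_2$ are equivalent wrt the set semantics, denoted $T_1 \equiv_{\mathcal{S}} T_2$, if 
and only if the two sets $\mathcal{S}(T_1)$ and $\mathcal{S}(T_2)$ are equal. 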


The meaning of a pair (P,O) belonging to S(T) is that if the proponent 
executes all actions from P and the opponent does not execute any of the actions 
from O, then the root goal of the tree T is achieved. In particular, if $(P, \emptyset) \in \mathcal{S}(T)$, 
then the opponent cannot prevent the proponent from achieving the root 
goal when they execute all actions from P. 
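The recursive construction of the set semantics can be sketched in a few lines. The code below assumes the same illustrative tuple encoding of terms as before (`("b", s, label)`, `("OR", s, [...])`, `("AND", s, [...])`, `("C", s, T1, T2)`); the function names `odot` and `sem` are hypothetical. It follows Definition 2: $\mathtt{OR}^p$, $\mathtt{AND}^o$ and $\mathtt{C}^o$ use set union, whereas $\mathtt{AND}^p$, $\mathtt{OR}^o$ and $\mathtt{C}^p$ use the operation $\odot$ from equality (2).

```python
from functools import reduce

def odot(S, Z):
    """Operation (2): pairwise union of the components of the pairs."""
    return {(ps | pz, os_ | oz) for (ps, os_) in S for (pz, oz) in Z}

def sem(t):
    """Set semantics of Definition 2, as a set of pairs of frozensets."""
    kind, s = t[0], t[1]
    if kind == "b":
        lab = frozenset([t[2]])
        return {(lab, frozenset())} if s == "p" else {(frozenset(), lab)}
    if kind == "C":
        parts = [sem(t[2]), sem(t[3])]
        use_odot = (s == "p")                      # C^p uses odot, C^o uses union
    else:
        parts = [sem(c) for c in t[2]]
        use_odot = (kind == "AND") == (s == "p")   # AND^p and OR^o use odot
    if use_odot:
        return reduce(odot, parts)
    return set().union(*parts)

def leaf(s, label):
    return ("b", s, label)

# Subtree from Example 2 (rooted in the "via ATM" node).
counter = ("C", "p", leaf("p", "eavesdrop"),
           ("C", "o", leaf("o", "coverKey"), leaf("p", "camera")))
via_atm = ("AND", "p", [
    ("OR", "p", [counter, leaf("p", "force")]),
    leaf("p", "stealCard"),
    leaf("p", "withdrawCash"),
])

for P, O in sorted(sem(via_atm), key=lambda a: (sorted(a[0]), sorted(a[1]))):
    print(sorted(P), sorted(O))
```

Because the components are sets, a repeated label is absorbed: `sem` returns the same result for $\mathtt{AND}^p(b, b)$ and for $b$.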


Example 3. The set semantics of the tree in Fig. 1 is the following 


S(T) = {({force, stealCard, withdrawCash}, ∅), 
        ({camera, eavesdrop, stealCard, withdrawCash}, ∅), 
        ({eavesdrop, stealCard, withdrawCash}, {coverKey}), 
        ({phish, logIn&execTrans}, {SMS}), 
        ({phish, guessUN, logIn&execTrans}, {SMS}), 
        ({phish, guessPwd, logIn&execTrans}, {strongPWD, SMS}), 
        ({guessUN, guessPwd, logIn&execTrans}, {strongPWD, SMS}), 
        ({phish, stealPhone, logIn&execTrans}, ∅), 
        ({phish, guessUN, stealPhone, logIn&execTrans}, ∅), 
        ({phish, guessPwd, stealPhone, logIn&execTrans}, {strongPWD}), 
        ({guessUN, guessPwd, stealPhone, logIn&execTrans}, {strongPWD})}. 


Throughout the rest of the paper, by an attack in an attack–defense tree $T$ 
we mean an element of its set semantics $\mathcal{S}(T)$. 

Grammar (1) ensures that attack—defense trees are well-typed with respect 
to the two players, i.e., p and o. However, not every well-typed tree is necessar- 
ily well-formed wrt the labels used. In particular, it should be ensured that the 
usage of repeated labels is consistent throughout the whole tree. For instance, 
if the action coverKey, of covering an ATM’s keypad with a hand, can be 
countered by monitoring with a camera, this countermeasure should also be 
attached to every other node labeled coverKey. Similarly, if execution of the 
action logIn&execTrans contributes to the achievement of the proponent’s goal 
of stealing money via the online banking services, this information should be 
kept in every subtree rooted in a node labeled via online banking. Thus, to 
ensure that the results of the methods developed further in the paper indeed 
reflect the intended aspects of a modeled scenario, in the following we assume 
that subtrees of an attack—defense tree that are rooted in identically labeled 
nodes are equivalent wrt the set semantics. 


3 Quantitative Analysis Using Attributes 


Among the methods for quantitative analysis of scenarios modeled with attack–defense 
trees are so-called attributes, introduced intuitively by Schneier in [26] 
and formalized for attack trees in [15,22] and for attack–defense trees in [18]. 
Attributes represent quantitative aspects of the modeled scenario, such as the 
minimal cost of executing an attack or the maximal damage caused by an attack. 
Numerous methods for evaluating the value of an attribute on attack–defense trees 
exist [2,8]; the most often used approach is the so-called bottom-up 
evaluation [18]. The idea behind the bottom-up evaluation is to assign attribute 
values to the basic actions and to propagate them up to the root of the tree using 
appropriate operations on the intermediate nodes. The notions of attribute and 
bottom-up evaluation are formalized using attribute domains. 


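Definition 3. An attribute domain for an attribute $\alpha$ on attack–defense trees 
is a tuple 

\[ A_\alpha = (D_\alpha, \mathtt{OR}^p_\alpha, \mathtt{AND}^p_\alpha, \mathtt{OR}^o_\alpha, \mathtt{AND}^o_\alpha, \mathtt{C}^p_\alpha, \mathtt{C}^o_\alpha), \]

where $D_\alpha$ is a set, and for $s \in \{p, o\}$, $\mathtt{OP} \in \{\mathtt{OR}, \mathtt{AND}\}$, 

1. $\mathtt{OP}^s_\alpha$ is an unranked function on $D_\alpha$, 
2. $\mathtt{C}^s_\alpha$ is a binary function on $D_\alpha$. 


Let $A_\alpha = (D_\alpha, \mathtt{OR}^p_\alpha, \mathtt{AND}^p_\alpha, \mathtt{OR}^o_\alpha, \mathtt{AND}^o_\alpha, \mathtt{C}^p_\alpha, \mathtt{C}^o_\alpha)$ be an attribute domain. A function 
$\beta_\alpha\colon \mathbb{B} \to D_\alpha$ that assigns values from the set $D_\alpha$ to basic actions of attack–defense 
trees is called a basic assignment for attribute $\alpha$. 


Definition 4. Let $A_\alpha = (D_\alpha, \mathtt{OR}^p_\alpha, \mathtt{AND}^p_\alpha, \mathtt{OR}^o_\alpha, \mathtt{AND}^o_\alpha, \mathtt{C}^p_\alpha, \mathtt{C}^o_\alpha)$ be an attribute 
domain, $T$ be an attack–defense tree, and $\beta_\alpha$ be a basic assignment for attribute 
$\alpha$. The value of attribute $\alpha$ for $T$ obtained via the bottom-up procedure, denoted 
$\alpha_B(T, \beta_\alpha)$, is defined recursively as 

\[
\alpha_B(T, \beta_\alpha) =
\begin{cases}
\beta_\alpha(b) & \text{if } T = b,\ b \in \mathbb{B}, \\
\mathtt{OP}^s_\alpha\big(\alpha_B(T_1^s, \beta_\alpha), \dots, \alpha_B(T_k^s, \beta_\alpha)\big) & \text{if } T = \mathtt{OP}^s(T_1^s, \dots, T_k^s), \\
\mathtt{C}^s_\alpha\big(\alpha_B(T_1^s, \beta_\alpha), \alpha_B(T_2^{\bar{s}}, \beta_\alpha)\big) & \text{if } T = \mathtt{C}^s(T_1^s, T_2^{\bar{s}}),
\end{cases}
\]

where $s \in \{p, o\}$, $\mathtt{OP} \in \{\mathtt{OR}, \mathtt{AND}\}$. (In the notation $\alpha_B(T, \beta_\alpha)$, the index $B$ refers 
to the “bottom-up” computation.) 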
An extensive overview of attribute domains and their classification can be 
found in [19]. The article [4] contains a case study and guidelines for practical 
application of the bottom-up procedure. Numerous examples of attributes for 
attack trees and attack trees extended with additional sequential refinement have 
been given in [13,15]. We gather some relevant attribute domains for attack- 
defense trees in Table 1. 


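Table 1. Selected attribute domains for attack–defense trees 


Attribute            | $D_\alpha$                               | $\mathtt{OR}^p_\alpha$ | $\mathtt{AND}^p_\alpha$ | $\mathtt{OR}^o_\alpha$ | $\mathtt{AND}^o_\alpha$ | $\mathtt{C}^p_\alpha$ | $\mathtt{C}^o_\alpha$ | $\beta_\alpha(b^o)$ 
min. attack cost     | $\mathbb{R}_{\geq 0} \cup \{+\infty\}$   | min    | +      | +      | min    | +      | min    | $+\infty$ 
max. damage          | $\mathbb{R}_{\geq 0} \cup \{-\infty\}$   | max    | +      | +      | max    | +      | max    | $-\infty$ 
min. skill level     | $\mathbb{N} \cup \{0, +\infty\}$         | min    | max    | max    | min    | max    | min    | $+\infty$ 
min. nb of experts   | $\mathbb{N} \cup \{0, +\infty\}$         | min    | +      | +      | min    | +      | min    | $+\infty$ 
satisfiability for p | $\{0, 1\}$                               | $\lor$ | $\land$ | $\land$ | $\lor$ | $\land$ | $\lor$ | 0 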


Example 4 illustrates the bottom-up procedure on the tree from Fig. 1. 


Example 4. Consider the tree $T$ given in Fig. 1, and let $\alpha$ be the minimal attack 
cost attribute (see Table 1 for its attribute domain). We fix the basic assignment 
$\beta_{\mathrm{cost}}$ to be as follows: 


basic action b    | $\beta_{\mathrm{cost}}(b)$ | basic action b | $\beta_{\mathrm{cost}}(b)$ 
stealCard         | 60                         | force          | 100 
camera            | 75                         | withdrawCash   | 10 
phish             | 70                         | eavesdrop      | 20 
guessPwd          | 120                        | guessUN        | 120 
logIn&execTrans   | 10                         | stealPhone     | 60 


Furthermore, for every basic action $b$ of the opponent, we set $\beta_{\mathrm{cost}}(b) = +\infty$. 
The bottom-up computation of the minimal cost on $T$ gives 


\[ \mathrm{cost}_B(T, \beta_{\mathrm{cost}}) = 165. \]


This value corresponds to monitoring with the camera, eavesdropping on the 
victim to learn their PIN, stealing the card, and withdrawing money. 
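The bottom-up procedure of Definition 4 can be sketched as a short recursive function. The code below is an illustration under stated assumptions: the tuple encoding of terms and the names `MIN_COST`, `bottom_up`, and `leaf` are hypothetical, and only the "via ATM" subtree from Example 2 is encoded, since the remainder of the tree is given by Fig. 1. The operators are those of the minimal attack cost domain from Table 1 and the values are those of Example 4.

```python
import math

INF = math.inf

# Minimal attack cost domain from Table 1, keyed by (operator, player).
# Each operation takes the list of values computed for the children.
MIN_COST = {
    ("OR", "p"): min, ("AND", "p"): sum,
    ("OR", "o"): sum, ("AND", "o"): min,
    ("C", "p"): sum,  ("C", "o"): min,
}

def bottom_up(t, domain, beta):
    """Definition 4: propagate values from the leaves up to the root."""
    kind, s = t[0], t[1]
    if kind == "b":
        return beta[t[2]]
    children = t[2] if kind in ("OR", "AND") else [t[2], t[3]]
    return domain[(kind, s)]([bottom_up(c, domain, beta) for c in children])

def leaf(s, label):
    return ("b", s, label)

# Subtree rooted in the "via ATM" node (Example 2).
via_atm = ("AND", "p", [
    ("OR", "p", [
        ("C", "p", leaf("p", "eavesdrop"),
                   ("C", "o", leaf("o", "coverKey"), leaf("p", "camera"))),
        leaf("p", "force"),
    ]),
    leaf("p", "stealCard"),
    leaf("p", "withdrawCash"),
])

# Values from Example 4; the opponent's action coverKey is assigned +infinity.
beta_cost = {"eavesdrop": 20, "camera": 75, "force": 100,
             "stealCard": 60, "withdrawCash": 10, "coverKey": INF}

print(bottom_up(via_atm, MIN_COST, beta_cost))
```

On this subtree the computation proceeds as min(+∞, 75) = 75 for the countered coverKey, 20 + 75 = 95 for eavesdropping, min(95, 100) = 95 at the $\mathtt{OR}^p$ node, and 95 + 60 + 10 = 165 at the root of the subtree, which is the value reported in Example 4 for the attack via the ATM.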


As already noticed in [22], the value of an attribute for a tree can also be 
evaluated directly on its semantics. For our purposes we define this evaluation as 
follows. 


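Definition 5. Let $(D_\alpha, \mathtt{OR}^p_\alpha, \mathtt{AND}^p_\alpha, \mathtt{OR}^o_\alpha, \mathtt{AND}^o_\alpha, \mathtt{C}^p_\alpha, \mathtt{C}^o_\alpha)$ be an attribute domain and 
let $T$ be an attack–defense tree with a basic assignment $\beta_\alpha$. The value of the 
attribute $\alpha$ for $T$ evaluated on the set semantics, denoted $\alpha_S(T, \beta_\alpha)$, is defined as 

\[ \alpha_S(T, \beta_\alpha) = (\mathtt{OR}^p_\alpha)_{(P,O) \in \mathcal{S}(T)} \Big( \mathtt{C}^p_\alpha \big( (\mathtt{AND}^p_\alpha)_{b \in P}\, \beta_\alpha(b),\ (\mathtt{OR}^o_\alpha)_{b \in O}\, \beta_\alpha(b) \big) \Big). \]

(In the notation $\alpha_S(T, \beta_\alpha)$, the index $S$ refers to the computation on the “set 
semantics”.) 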


Example 5. Consider again the tree from Fig. 1 and the basic assignment for the 
minimal cost attribute given in Example 4. The costs of all elements of the set 
semantics for $T$ are as follows 


({force, stealCard, withdrawCash}, ∅)                            170 
({camera, eavesdrop, stealCard, withdrawCash}, ∅)                165 
({eavesdrop, stealCard, withdrawCash}, {coverKey})               +∞ 
({phish, logIn&execTrans}, {SMS})                                +∞ 
({phish, guessUN, logIn&execTrans}, {SMS})                       +∞ 
({phish, guessPwd, logIn&execTrans}, {strongPWD, SMS})           +∞ 
({guessUN, guessPwd, logIn&execTrans}, {strongPWD, SMS})         +∞ 
({phish, stealPhone, logIn&execTrans}, ∅)                        140 
({phish, guessUN, stealPhone, logIn&execTrans}, ∅)               260 
({phish, guessPwd, stealPhone, logIn&execTrans}, {strongPWD})    +∞ 
({guessUN, guessPwd, stealPhone, logIn&execTrans}, {strongPWD})  +∞. 


The evaluation of the minimal cost attribute on the set semantics for $T$ gives 
\[ \mathrm{cost}_S(T, \beta_{\mathrm{cost}}) = \min\{170, 165, +\infty, 140, 260\} = 140, \]


which corresponds to performing the phishing attack to get the user name and 
their password, stealing the phone, and logging into the online bank application 
to execute the transfer. 
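The evaluation on the set semantics from Definition 5 can be replayed directly on the eleven attacks listed in Example 5. The sketch below hard-codes that list instead of computing $\mathcal{S}(T)$ from the tree; the names `attack_cost` and `cost_on_set_semantics` are illustrative. For the minimal attack cost domain, $\mathtt{AND}^p$, $\mathtt{OR}^o$ and $\mathtt{C}^p$ are all addition, $\mathtt{OR}^p$ is the minimum, and every opponent action is assigned $+\infty$.

```python
import math

INF = math.inf

# Basic assignment of Example 4 for the proponent's actions.
beta_cost = {
    "stealCard": 60, "force": 100, "camera": 75, "withdrawCash": 10,
    "phish": 70, "eavesdrop": 20, "guessPwd": 120, "guessUN": 120,
    "logIn&execTrans": 10, "stealPhone": 60,
}

# The attacks (P, O) of the set semantics, as listed in Example 5.
attacks = [
    ({"force", "stealCard", "withdrawCash"}, set()),
    ({"camera", "eavesdrop", "stealCard", "withdrawCash"}, set()),
    ({"eavesdrop", "stealCard", "withdrawCash"}, {"coverKey"}),
    ({"phish", "logIn&execTrans"}, {"SMS"}),
    ({"phish", "guessUN", "logIn&execTrans"}, {"SMS"}),
    ({"phish", "guessPwd", "logIn&execTrans"}, {"strongPWD", "SMS"}),
    ({"guessUN", "guessPwd", "logIn&execTrans"}, {"strongPWD", "SMS"}),
    ({"phish", "stealPhone", "logIn&execTrans"}, set()),
    ({"phish", "guessUN", "stealPhone", "logIn&execTrans"}, set()),
    ({"phish", "guessPwd", "stealPhone", "logIn&execTrans"}, {"strongPWD"}),
    ({"guessUN", "guessPwd", "stealPhone", "logIn&execTrans"}, {"strongPWD"}),
]

def attack_cost(P, O):
    """C^p(AND^p over P, OR^o over O); each opponent action costs +infinity."""
    return sum(beta_cost[b] for b in P) + sum(INF for _ in O)

def cost_on_set_semantics(attack_list):
    """OR^p = min, taken over all attacks of the set semantics."""
    return min(attack_cost(P, O) for P, O in attack_list)

for P, O in attacks:
    print(sorted(P), sorted(O), attack_cost(P, O))
print(cost_on_set_semantics(attacks))
```

Only the four attacks with an empty second component have a finite cost, and the cheapest of them is the one using phish, stealPhone, and logIn&execTrans.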


Notice that the values obtained for the same tree in Examples 4 and 5 are 
different, despite the fact that the same basic assignment and the same attribute 
domain have been used. This is due to the fact that the tree from Fig. 1 contains 
cloned nodes which the standard bottom-up evaluation cannot handle properly. 
In the next section, we provide conditions and develop a method for a proper 
evaluation of attributes on attack—defense trees with cloned nodes. 


4 Quantification on Attack–Defense Trees with Clones 


Depending on what is considered to be an attack in an attack–defense tree, 
different semantics can be used. Note that a semantics for attack–defense trees 
naturally introduces an equivalence relation in $\mathbb{T}$. It is thus of great importance to 
select a method of quantitative analysis that is consistent with a chosen semantics, 
i.e., a method that for any two trees equivalent wrt the employed semantics 
returns the same result. This issue was recognized by the authors of [22] for 
attack trees, and addressed, in the case of attack–defense trees in [18], with the 
notion of compatibility between an attribute domain and a semantics. Below, we 
adapt the definition of compatibility from [18] to the bottom-up computation. 


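Definition 6. Let $A_\alpha = (D_\alpha, \mathtt{OR}^p_\alpha, \mathtt{AND}^p_\alpha, \mathtt{OR}^o_\alpha, \mathtt{AND}^o_\alpha, \mathtt{C}^p_\alpha, \mathtt{C}^o_\alpha)$ be an attribute 
domain. The bottom-up procedure, defined in Definition 4, is compatible with 
a semantics $\equiv$ for attack–defense trees, if for every two trees $T_1$, $T_2$ satisfying 
$T_1 \equiv T_2$, the equality $\alpha_B(T_1, \beta_\alpha) = \alpha_B(T_2, \beta_\alpha)$ holds for any basic 
assignment $\beta_\alpha$. 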


For instance, it is well known that the bottom-up computation of the minimal 
cost using the domain from Table 1 is not compatible with the propositional 
semantics. Indeed, consider the trees $T_1 = \mathtt{OR}^p(b, \mathtt{AND}^p(b', b''))$ and $T_2 = 
\mathtt{AND}^p(\mathtt{OR}^p(b, b'), \mathtt{OR}^p(b, b''))$ whose corresponding propositional formulae are equivalent. 
However, for the basic assignment $\beta_{\mathrm{cost}}(b) = 3$, $\beta_{\mathrm{cost}}(b') = 4$, $\beta_{\mathrm{cost}}(b'') = 1$ 
the values $\mathrm{cost}_B(T_1, \beta_{\mathrm{cost}}) = 3$ and $\mathrm{cost}_B(T_2, \beta_{\mathrm{cost}}) = 4$ are different. Similarly, the 
bottom-up computation of the minimal cost attribute is not compatible with 
the set semantics. This can be shown by considering trees $T_3 = \mathtt{AND}^p(b, b)$ and 
$T_4 = b$ and will further be discussed in Corollary 1. 
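The two counterexamples can be checked with a few lines of code. The sketch below uses a simplified encoding that is sufficient for trees containing only proponent's refinements (a label is a string, an inner node is a pair of an operator name and a list of children); this encoding and the function name `cost` are assumptions made for illustration.

```python
def cost(t, beta):
    """Bottom-up minimal cost for proponent-only trees: OR^p = min, AND^p = +."""
    if isinstance(t, str):          # basic action
        return beta[t]
    op, children = t
    values = [cost(c, beta) for c in children]
    return min(values) if op == "OR" else sum(values)

beta = {"b": 3, "b'": 4, "b''": 1}

# T1 and T2 have equivalent propositional formulae (distributivity).
T1 = ("OR", ["b", ("AND", ["b'", "b''"])])
T2 = ("AND", [("OR", ["b", "b'"]), ("OR", ["b", "b''"])])

# T3 and T4 have the same set semantics, {({b}, {})}.
T3 = ("AND", ["b", "b"])
T4 = "b"

print(cost(T1, beta), cost(T2, beta))   # 3 and 4
print(cost(T3, beta), cost(T4, beta))   # 6 and 3
```

In $T_2$ the cheapest choices in the two $\mathtt{OR}^p$ nodes are $b$ (cost 3) and $b''$ (cost 1), which together do not form an attack of cost 4 in $T_1$; in $T_3$ the cost of the clone $b$ is counted twice.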

The notion of compatibility from Definition 6 can be generalized to any 
computation on attack–defense trees. 


Definition 7. Let $D$ be a set and let $f$ be a function on $\mathbb{T} \times D$. We say that $f$ 
is compatible with a semantics $\equiv$ for attack–defense trees, if for every two trees 
$T_1$, $T_2$ satisfying $T_1 \equiv T_2$ the equality $f(T_1, d) = f(T_2, d)$ holds for any $d \in D$. 


To illustrate the difference between the compatibility notions defined in 
Definitions 6 and 7, one can consider the method for computing the so-called 
attacker’s expected outcome, proposed by Jürgenson and Willemson in [17]. Since 
this method is not based on an attribute domain, it cannot be simulated using 
the bottom-up evaluation. However, the authors show that the outcome of their 
computations is independent of the Boolean representation of an attack tree. 
This means that the method proposed in [17] is compatible with the propositional 
semantics for attack trees. 


Remark 1. Consider an attribute domain $A_\alpha = (D_\alpha, \oplus, \otimes, \otimes, \oplus, \otimes, \oplus)$ with $\oplus$ 
and $\otimes$ being binary, associative, and commutative operations on $D_\alpha$ (see footnote 2). Under 
these assumptions, for a tree $T$ and a basic assignment $\beta_\alpha$, we have 

\[
\alpha_S(T, \beta_\alpha)
= \bigoplus_{(P,O) \in \mathcal{S}(T)} \Big( \bigotimes_{b \in P} \beta_\alpha(b) \otimes \bigotimes_{b \in O} \beta_\alpha(b) \Big)
= \bigoplus_{(P,O) \in \mathcal{S}(T)} \ \bigotimes_{b \in P \cup O} \beta_\alpha(b).
\]

Since for any two trees $T_1$ and $T_2$ that are equivalent wrt the set semantics the 
expressions $\alpha_S(T_1, \beta_\alpha)$ and $\alpha_S(T_2, \beta_\alpha)$ differ only in the order of the terms, they 
yield the same (numerical) result. In other words, under the above assumptions, 
the computation $\alpha_S$ is compatible with the set semantics. 

2 Note that a binary and associative operation can be modeled with an unranked 
operator. 


As has been observed in [18,19], there is a wide class of attribute domains 
of the form $(D_\alpha, \oplus, \otimes, \otimes, \oplus, \otimes, \oplus)$, where $(D_\alpha, \oplus, \otimes)$ constitutes a commutative 
idempotent semiring. Recall that an algebraic structure $(R, \oplus, \otimes)$ is a commutative 
idempotent semiring if $\oplus$ is an idempotent operation, both operations $\oplus$ 
and $\otimes$ are associative and commutative, their neutral elements, denoted here 
by $e_\oplus$ and $e_\otimes$, belong to $R$, operation $\otimes$ distributes over $\oplus$, and the absorbing 
element of $\otimes$, denoted $a_\otimes$, is equal to $e_\oplus$. 


Remark 2. In order for the computations performed using the bottom-up evaluation 
to be consistent with the intuition, the basic actions of the opponent 
are assigned a specific value. In the case of an attribute domain 
$(D_\alpha, \oplus, \otimes, \otimes, \oplus, \otimes, \oplus)$ based on a commutative idempotent semiring $(D_\alpha, \oplus, \otimes)$ 
this value is equal to $a_\otimes$. One of the consequences of this choice is that if for 
every attack $(P, O) \in \mathcal{S}(T)$ the set $O$ is not empty, then $\alpha_S(T, \beta_\alpha) = a_\otimes = e_\oplus$, 
indicating the fact that the proponent cannot achieve the root goal if the opponent 
executes all of their actions present in the tree. Note that this is closely 
related to the choice of the functions $\mathtt{C}^p_\alpha = \otimes$ and $\mathtt{C}^o_\alpha = \oplus$. 


Example 6. For instance, in the case of the minimal cost attribute domain 
(cf. Table 1), which is based on the idempotent commutative semiring $(\mathbb{R}_{\geq 0} \cup 
\{+\infty\}, \min, +)$, the basic actions of the opponent are originally assigned $+\infty$, 
which is both a neutral element for the min operation, and the absorbing element 
for the addition. This implies that, if on a certain path, there is an opponent’s 
action which is not countered by the proponent, the corresponding branch will 
result in the value $+\infty$, which models that it is impossible (since too costly) for 
the proponent. This is due to the fact that $\mathtt{C}^p_{\mathrm{cost}} = +$. However, if the opponent’s 
action is countered by the proponent’s action, the corresponding branch will 
yield a real value different from $+\infty$, because the min operator, used for $\mathtt{C}^o_{\mathrm{cost}}$, 
will be applied between a real number assigned to the proponent’s counter and 
the $+\infty$. 


The first contribution of this work is presented in Theorem 1. It establishes a 
relation between the evaluation of attributes via the bottom-up procedure and 
their evaluation on the set semantics. Its proof is postponed to Sect. 5. 


Theorem 1. Let $T$ be an attack–defense tree generated by grammar (1) and let 
$A_\alpha = (D_\alpha, \oplus, \otimes, \otimes, \oplus, \otimes, \oplus)$ be an attribute domain such that the operations $\oplus$ 
and $\otimes$ are associative and commutative, $\oplus$ is idempotent, and $\otimes$ distributes over 
$\oplus$. If 


– there are no repeated labels in $T$, or 
– the operator $\otimes$ is idempotent, 


then the equality $\alpha_B(T, \beta_\alpha) = \alpha_S(T, \beta_\alpha)$ holds for any basic assignment $\beta_\alpha$. 
Note that the assumptions of Theorem 1 are satisfied by any commutative idempotent 
semiring, thus the same result also holds for attributes whose attribute 
domains are based on commutative idempotent semirings. Furthermore, one can 
compare the assumption on the lack of repeated labels in Theorem 1 with the lin- 
earity of an attack—defense tree, considered in [2]. The authors of [2] have proven 
that under this strong assumption, the evaluation method that they have devel- 
oped for multi-parameter attributes coincides with their bottom-up evaluation. 


Remark 3. Consider again the attribute domain specified in Theorem 1. Suppose 
that the operation $\otimes$ is not idempotent. Then there exists $d \in D_\alpha$, such that 
$d \otimes d \neq d$. In consequence, for $\beta_\alpha(b) = d$ and the trees $T_1 = b$ and $T_2 = 
\mathtt{AND}^p(b, b)$ that are equivalent wrt the set semantics, we have $\alpha_B(T_1, \beta_\alpha) \neq 
\alpha_B(T_2, \beta_\alpha)$. This shows that if the operation $\otimes$ is not idempotent, then the 
bottom-up evaluation based on the attribute domain satisfying the remaining 
assumptions of Theorem 1 is not compatible with the set semantics. 
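As a worked instance of this argument, take the minimal attack cost domain from Table 1, where $\otimes$ is addition, and any finite value $d > 0$ (the concrete choice of the domain is ours, made for illustration):

```latex
\[
\begin{aligned}
\mathcal{S}(\mathtt{AND}^p(b, b)) &= \{(\{b\}, \emptyset)\} \odot \{(\{b\}, \emptyset)\}
  = \{(\{b\} \cup \{b\}, \emptyset)\} = \{(\{b\}, \emptyset)\} = \mathcal{S}(b), \\
\mathrm{cost}_B(\mathtt{AND}^p(b, b), \beta_{\mathrm{cost}}) &= d + d = 2d
  \;\neq\; d = \mathrm{cost}_B(b, \beta_{\mathrm{cost}}).
\end{aligned}
\]
```

The two trees are thus equivalent wrt the set semantics, yet the bottom-up procedure counts the cost of the clone twice.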


Theorem 1 and Remarks 1 and 3 immediately yield the following corollary. 


Corollary 1. Let $A_\alpha = (D_\alpha, \oplus, \otimes, \otimes, \oplus, \otimes, \oplus)$ be an attribute domain such that 
the operations $\oplus$ and $\otimes$ are associative and commutative, $\oplus$ is idempotent, and 
$\otimes$ distributes over $\oplus$. The bottom-up procedure based on $A_\alpha$ is compatible with 
the set semantics if and only if the operation $\otimes$ is idempotent. 


We can also notice that if the assumptions from Corollary 1 are satisfied but 
the operation $\otimes$ is not idempotent, then the bottom-up procedure is compatible 
with the so-called multiset semantics (introduced for attack trees in [22] and 
attack–defense trees in [18]) which uses pairs of multisets instead of pairs of 
sets. 

Some of the domains based on idempotent semirings have a specific property 
that we encapsulate in the notion of non-increasing domain. 


Definition 8. Let $A_\alpha$ be an attribute domain. We say that $A_\alpha$ is non-increasing 
if $A_\alpha = (D_\alpha, \oplus, \otimes, \otimes, \oplus, \otimes, \oplus)$, $(D_\alpha, \oplus, \otimes)$ is a commutative idempotent semiring, 
and for every $d, c \in D_\alpha$, the inequality $d \otimes c \preceq d$ holds, where $\preceq$ stands for 
the canonical partial order on $D_\alpha$, i.e., the order defined by $d \preceq c$ if and only if 
$d \oplus c = c$. 


Example 7. Among the attribute domains presented in Table 1, all but one are 
non-increasing. The only one which is not non-increasing is the maximal damage 
domain. 
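The claim can be verified by unfolding the canonical order for the operations listed in Table 1. The derivation below is our own elaboration of Definition 8; note that for $\oplus = \min$ the canonical order is the reverse of the usual numerical order.

```latex
\[
\begin{aligned}
&\text{min. attack cost } (\oplus = \min,\ \otimes = +):
 && d \otimes c \preceq d \iff \min(d + c,\, d) = d \iff d \leq d + c,
 && \text{true for all } c \geq 0; \\
&\text{min. skill level } (\oplus = \min,\ \otimes = \max):
 && d \otimes c \preceq d \iff \min(\max(d, c),\, d) = d,
 && \text{true since } \max(d, c) \geq d; \\
&\text{satisfiability } (\oplus = \lor,\ \otimes = \land):
 && d \otimes c \preceq d \iff (d \land c) \lor d = d,
 && \text{true by absorption}; \\
&\text{max. damage } (\oplus = \max,\ \otimes = +):
 && d \otimes c \preceq d \iff \max(d + c,\, d) = d \iff d + c \leq d,
 && \text{false, e.g., for } d = c = 1.
\end{aligned}
\]
```

The minimal number of experts uses the same operations as the minimal attack cost, so the first line covers it as well.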


Note that in order to be able to evaluate the value of an attribute on the set 
semantics $\mathcal{S}(T)$, one needs to construct the semantics itself. This task might be 
computationally expensive, since, in the worst case, the number of elements of 
$\mathcal{S}(T)$ is exponential in the number of nodes of $T$. In contrast, the complexity of 
the bottom-up procedure is linear in the number of nodes of the underlying tree 
(if the operations performed on the intermediate nodes are linear in the number 
of arguments). Thus, it is desirable to ensure that $\alpha_B(T, \beta_\alpha) = \alpha_S(T, \beta_\alpha)$. 
By Theorem 1, this equality holds in a wide class of attributes, provided that 
there are no clones in $T$. If $T$ contains clones, then the two methods might return 
different values (as illustrated in Remark 3). 

To deal with this issue, we present the second contribution of this work. In 
Algorithm 1, we propose a method of evaluating the value of attributes having 
non-increasing domains on attack–defense trees, that takes the repetition of 
labels into account. The algorithm relies on the following notion of necessary 
clones. 


Definition 9. Let $b$ be a cloned basic action of the proponent in an attack–defense 
tree $T$. If $b$ is present in every attack of the form $(P, \emptyset) \in \mathcal{S}(T)$, then $b$ 
is a necessary clone; otherwise it is an optional clone. 


It is easy to see that the tree from Fig. 1 does not contain any necessary 
clones. Indeed, this tree contains only one clone, phish; however, there exists 
the attack ({force, stealCard, withdrawCash}, ∅) which does not make use of 
the corresponding phishing action. 

The sets of all necessary and optional clones in a tree $T$ are denoted with 
$C_N(T)$ and $C_O(T)$, respectively. When there is no danger of ambiguity, we use $C_N$ 
and $C_O$ instead of $C_N(T)$ and $C_O(T)$. The idea behind Algorithm 1 is to first recognize 
the set $C_N$ of necessary clones and temporarily ensure that the values of 
the attribute assigned to them do not influence the result of the bottom-up procedure. 
Then the values of the optional clones are also temporarily modified, and 
the corresponding bottom-up evaluations are performed. Only then the result 
is adjusted in such a way that the original values of the necessary clones are 
taken into account. Before explaining Algorithm 1 in detail, we provide a simple 
method for determining whether a cloned basic action of the proponent is a 
necessary clone in the following lemma. 


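Lemma 1. Let $T$ be an attack–defense tree generated by grammar (1) and $a \in 
\mathbb{B}^p$ be a cloned action of the proponent in $T$. Let $\alpha$ be the minimal skill level 
attribute (cf. Table 1) with the following basic assignment, for $b \in \mathbb{B}$ 

\[
\beta_{\mathrm{skill}}(b) =
\begin{cases}
0 & \text{if } b \neq a \text{ and } b \in \mathbb{B}^p, \\
1 & \text{if } b = a, \\
+\infty & \text{otherwise.}
\end{cases}
\]

Then, $a$ is a necessary clone in $T$ if and only if $\mathrm{skill}_B(T, \beta_{\mathrm{skill}}) = 1$. 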


Proof. Observe that under the given basic assignment the value of $\mathrm{skill}_S(T, \beta_{\mathrm{skill}})$ 
is equal to 1 if and only if $a$ is a necessary clone. Since max is an idempotent 
operation, $\mathrm{skill}_B(T, \beta_{\mathrm{skill}}) = \mathrm{skill}_S(T, \beta_{\mathrm{skill}})$, by Theorem 1. The lemma 
follows. 
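Lemma 1 translates into a simple test based on a single bottom-up pass. The sketch below assumes the illustrative tuple encoding of terms (`("b", s, label)`, `("OR", s, [...])`, `("AND", s, [...])`, `("C", s, T1, T2)`); the function names and the two small example trees are hypothetical and are not taken from the paper.

```python
import math

INF = math.inf

# Minimal skill level domain from Table 1, keyed by (operator, player).
MIN_SKILL = {
    ("OR", "p"): min, ("AND", "p"): max,
    ("OR", "o"): max, ("AND", "o"): min,
    ("C", "p"): max,  ("C", "o"): min,
}

def bottom_up(t, domain, beta):
    """Bottom-up procedure; beta is a function of (player, label)."""
    kind, s = t[0], t[1]
    if kind == "b":
        return beta(s, t[2])
    children = t[2] if kind in ("OR", "AND") else [t[2], t[3]]
    return domain[(kind, s)]([bottom_up(c, domain, beta) for c in children])

def is_necessary_clone(t, a):
    """Lemma 1: a is a necessary clone iff skill_B(T, beta_skill) = 1."""
    def beta_skill(s, label):
        if label == a:
            return 1
        return 0 if s == "p" else INF
    return bottom_up(t, MIN_SKILL, beta_skill) == 1

def leaf(s, label):
    return ("b", s, label)

a, x, y, z = (leaf("p", n) for n in ("a", "x", "y", "z"))

# Hypothetical trees in which the action a is cloned.
t_necessary = ("AND", "p", [("OR", "p", [a, x]), a])            # every attack uses a
t_optional = ("OR", "p", [("AND", "p", [a, x]),
                          ("AND", "p", [a, y]), z])             # the attack {z} avoids a

print(is_necessary_clone(t_necessary, "a"), is_necessary_clone(t_optional, "a"))
```

Each call performs one bottom-up pass, so classifying all clones takes one pass per cloned label.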


We now explain our algorithm for evaluating attributes on attack–defense 
trees with repeated labels. Algorithm 1 takes as input an attack–defense tree $T$ 
generated by grammar (1), an attribute domain $A_\alpha$, and a basic assignment $\beta_\alpha$ 
for the attribute. Once the sets of necessary and optional clones have been 
determined, new basic assignments are created. Under each of these assignments 
$\beta'_\alpha$, the necessary clones receive value $e_\otimes$ (in line 3). Intuitively, this ensures two 
things. First, that when the bottom-up procedure with the assignment $\beta'_\alpha$ is 
performed (in line 8), the value selected at the nodes corresponding to a choice 
made by the proponent (e.g., at the $\mathtt{OR}^p$ nodes) is likely to be the one corresponding 
to a subset of actions of some optimal attack (i.e., a subset containing 
a necessary clone). The second outcome is that in the final result of the algorithm, 
the values of $\beta_\alpha$ assigned to the necessary clones are taken into account 
exactly once (line 11). 


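Algorithm 1. Evaluation of attributes in attack–defense tree with clones 
Input: Attack–defense tree $T$, attribute domain $(D_\alpha, \oplus, \otimes, \otimes, \oplus, \otimes, \oplus)$, $\beta_\alpha\colon \mathbb{B} \to D_\alpha$ 
Output: $\alpha_A(T, \beta_\alpha)$ 
 1: $\alpha_A(T, \beta_\alpha) \leftarrow e_\oplus$ 
 2: initialize $C_N$, $C_O$ 
 3: $\beta'_\alpha(b) \leftarrow e_\otimes$ for every $b \in C_N$ 
 4: $\beta'_\alpha(b) \leftarrow \beta_\alpha(b)$ for every $b \in \mathbb{B} \setminus (C_N \cup C_O)$ 
 5: for every subset $C \subseteq C_O$ do 
 6:     $\beta'_\alpha(b) \leftarrow a_\otimes$ for every $b \in C$ 
 7:     $\beta'_\alpha(b) \leftarrow e_\otimes$ for every $b \in C_O \setminus C$ 
 8:     $r^C \leftarrow \alpha_B(T, \beta'_\alpha) \otimes \bigotimes_{b \in C_O \setminus C} \beta_\alpha(b)$ 
 9:     $\alpha_A(T, \beta_\alpha) \leftarrow \alpha_A(T, \beta_\alpha) \oplus r^C$ 
10: end for 
11: $\alpha_A(T, \beta_\alpha) \leftarrow \alpha_A(T, \beta_\alpha) \otimes \bigotimes_{b \in C_N} \beta_\alpha(b)$ 
12: return $\alpha_A(T, \beta_\alpha)$ 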


In lines 6-7, an assignment $\beta'_\alpha$ is created for every subset $C$ of the set of 
optional clones $C_O$. The clones from $C$ are assigned $a_\otimes$, which intuitively ensures 
that they are ignored by the bottom-up procedure, and the remaining optional 
clones are assigned $e_\otimes$ (again, to ensure that their values under $\beta_\alpha$ will eventually 
be counted exactly once). The result of computations performed in the for loop 
is multiplied (in the sense of performing operation $\otimes$) in line 11 by the product 
of values assigned to the necessary clones. (Note that the index $A$ in the notation 
$\alpha_A(T, \beta_\alpha)$ refers to the evaluation using Algorithm 1.) 


Example 8. We illustrate Algorithm 1 on the tree $T$ from Fig. 1 and the minimal 
cost attribute domain. Consider the basic assignment of cost given in Example 4. 
Observe that $C_N = \emptyset$ and $C_O = \{\mathit{phish}\}$. 

The sets $C$ considered in the for loop, their influence on the assignment of 
cost, and their corresponding results $r^C$ are the following 


$C = \emptyset$,                $\beta'_{\mathrm{cost}}(\mathit{phish}) = 0$,        $r^C = 140$, 
$C = \{\mathit{phish}\}$,       $\beta'_{\mathrm{cost}}(\mathit{phish}) = +\infty$,  $r^C = 165$. 

The value of $\mathrm{cost}_A(T, \beta_{\mathrm{cost}})$ after the for loop is $\min\{140, 165\}$. Since $C_N = 
\emptyset$, the algorithm returns $\mathrm{cost}_A(T, \beta_{\mathrm{cost}}) = 140$. This value corresponds to the 
cost of the attack ({phish, logIn&execTrans, stealPhone}, ∅), which is indeed 
the cheapest attack in the tree under the given basic assignment, as already 
illustrated in Example 5. Notice furthermore, that $\mathrm{cost}_A(T, \beta_{\mathrm{cost}}) = \mathrm{cost}_S(T, \beta_{\mathrm{cost}})$ 
(cf. Example 5). 
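The following sketch replays Algorithm 1 for the minimal cost attribute. It is an illustration under explicit assumptions: terms use a hypothetical tuple encoding, clones are classified with the test from Lemma 1, and the online-banking branch of the tree is our own guess at a structure that is consistent with the set semantics listed in Example 3 (the actual tree is given in Fig. 1 and may be shaped differently).

```python
import math
from collections import Counter
from itertools import combinations

INF = math.inf

def kids(t):
    return t[2] if t[0] in ("OR", "AND") else [t[2], t[3]]

def bottom_up(t, oplus, otimes, beta):
    """Bottom-up procedure for a domain of the form (D, +o, *o, *o, +o, *o, +o)."""
    kind, s = t[0], t[1]
    if kind == "b":
        return beta[t[2]]
    vals = [bottom_up(c, oplus, otimes, beta) for c in kids(t)]
    # OR^p, AND^o, C^o use oplus; AND^p, OR^o, C^p use otimes.
    use_oplus = (kind == "OR") == (s == "p") if kind != "C" else s == "o"
    return (oplus if use_oplus else otimes)(vals)

def leaves(t):
    if t[0] == "b":
        return [(t[1], t[2])]
    return [x for c in kids(t) for x in leaves(c)]

def algorithm1(t, beta, oplus, otimes, e_otimes, a_otimes):
    counts = Counter(leaves(t))
    clones = [lab for (s, lab), n in counts.items() if s == "p" and n > 1]

    def necessary(a):  # Lemma 1, minimal skill level domain (min, max)
        skill = {lab: (1 if lab == a else 0 if s == "p" else INF)
                 for s, lab in counts}
        return bottom_up(t, min, max, skill) == 1

    c_n = [a for a in clones if necessary(a)]
    c_o = [a for a in clones if a not in c_n]
    result = a_otimes                      # line 1: e_oplus equals a_otimes
    for k in range(len(c_o) + 1):          # line 5: all subsets C of C_O
        for c in combinations(c_o, k):
            b2 = dict(beta)                                                   # line 4
            b2.update({x: e_otimes for x in c_n})                             # line 3
            b2.update({x: a_otimes if x in c else e_otimes for x in c_o})     # lines 6-7
            r = otimes([bottom_up(t, oplus, otimes, b2)]
                       + [beta[x] for x in c_o if x not in c])                # line 8
            result = oplus([result, r])                                       # line 9
    return otimes([result] + [beta[x] for x in c_n])                          # line 11

def leaf(s, label):
    return ("b", s, label)

via_atm = ("AND", "p", [
    ("OR", "p", [("C", "p", leaf("p", "eavesdrop"),
                  ("C", "o", leaf("o", "coverKey"), leaf("p", "camera"))),
                 leaf("p", "force")]),
    leaf("p", "stealCard"), leaf("p", "withdrawCash")])
# Assumed shape of the online-banking branch (not spelled out in the text).
via_online = ("AND", "p", [
    ("OR", "p", [leaf("p", "phish"), leaf("p", "guessUN")]),
    ("OR", "p", [leaf("p", "phish"),
                 ("C", "p", leaf("p", "guessPwd"), leaf("o", "strongPWD"))]),
    ("C", "p", leaf("p", "logIn&execTrans"),
     ("C", "o", leaf("o", "SMS"), leaf("p", "stealPhone")))])
tree = ("OR", "p", [via_atm, via_online])

beta_cost = {"stealCard": 60, "force": 100, "camera": 75, "withdrawCash": 10,
             "phish": 70, "eavesdrop": 20, "guessPwd": 120, "guessUN": 120,
             "logIn&execTrans": 10, "stealPhone": 60,
             "coverKey": INF, "strongPWD": INF, "SMS": INF}

print(bottom_up(tree, min, sum, beta_cost))           # plain bottom-up value
print(algorithm1(tree, beta_cost, min, sum, 0, INF))  # value from Algorithm 1
```

On this encoding the plain bottom-up pass counts the cost of phish twice in the online-banking branch, whereas the loop over the subsets of $C_O = \{\mathit{phish}\}$ produces the two intermediate results 140 and 165 reported in Example 8.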


Now we turn our attention to the complexity of Algorithm 1. Let $k$ be the number 
of distinct clones of the proponent in $T$. Furthermore, let $n$ be the number of 
nodes in $T$. We assume that the complexity of operations $\oplus$ and $\otimes$ is linear in 
the number of arguments, which is a reasonable assumption in the view of the 
existing attribute domains (cf. Table 1). This implies that the result of a single 
bottom-up procedure in $T$ is obtained in time $O(n)$. Thus, from the operations 
performed in lines 1-4, the most complex one is the initialization of the sets 
$C_N$ and $C_O$, the time complexity of which is in $O(kn)$ (by Lemma 1). Since the 
for loop from line 5 iterates over all of the subsets of the optional clones, and 
the operations inside the loop are linear in $n$, the overall time complexity of 
Algorithm 1 is in $O(n 2^k)$. 

In Theorem 2 we give sufficient conditions for the result $\alpha_A(T, \beta_\alpha)$ of Algorithm 1 
to be equal to the result $\alpha_S(T, \beta_\alpha)$ of evaluation on the set semantics. 
Its proof is presented in Sect. 5. 


Theorem 2. Let $T$ be an attack–defense tree generated by grammar (1) and $A_\alpha$ 
be a non-increasing attribute domain. Then the equality $\alpha_A(T, \beta_\alpha) = \alpha_S(T, \beta_\alpha)$ 
holds for every basic assignment $\beta_\alpha\colon \mathbb{B} \to D_\alpha$ satisfying $\beta_\alpha|_{\mathbb{B}^o} \equiv a_\otimes$. 


Remark 1 and Theorem 2 imply the following corollary. 


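Corollary 2. Let $A_\alpha = (D_\alpha, \oplus, \otimes, \otimes, \oplus, \otimes, \oplus)$ be a non-increasing attribute 
domain and let $\mathfrak{B} := \{\beta_\alpha\colon \mathbb{B} \to D_\alpha \text{ s.t. } \beta_\alpha|_{\mathbb{B}^o} \equiv a_\otimes\}$. Then, the evaluation procedure 
$\alpha_A\colon \mathbb{T} \times \mathfrak{B} \to D_\alpha$ specified by Algorithm 1 is compatible with the set 
semantics (in the sense of Definition 7). 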


5 Proofs of Theorems 1 and 2 


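Throughout this section it is assumed that $T$ is an attack–defense tree generated 
by grammar (1) and $A_\alpha = (D_\alpha, \oplus, \otimes, \otimes, \oplus, \otimes, \oplus)$ is an attribute domain with 
the operations $\oplus$ and $\otimes$ that are associative and commutative, $\oplus$ is idempotent, 
and $\otimes$ distributes over $\oplus$. We begin with examining parallels between attribute 
domains of this type and the set semantics. 

Since the operation $\otimes$ distributes over $\oplus$, the result of the bottom-up procedure 
for any basic assignment $\beta_\alpha$ of $\alpha$ can be represented as 

\[
\begin{aligned}
\alpha_B(T, \beta_\alpha) ={}& \big(\beta_\alpha(b^1_1) \otimes \beta_\alpha(b^1_2) \otimes \dots \otimes \beta_\alpha(b^1_{k_1})\big) \\
&\oplus \big(\beta_\alpha(b^2_1) \otimes \beta_\alpha(b^2_2) \otimes \dots \otimes \beta_\alpha(b^2_{k_2})\big) \oplus \cdots \\
&\oplus \big(\beta_\alpha(b^n_1) \otimes \beta_\alpha(b^n_2) \otimes \dots \otimes \beta_\alpha(b^n_{k_n})\big).
\end{aligned}
\tag{3}
\]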
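Observe that with the set $D_\mathcal{S} = \wp(\wp(\mathbb{B}^p) \times \wp(\mathbb{B}^o))$ and the operation 
$\odot$ defined by equality (2), the algebraic structure $(D_\mathcal{S}, \cup, \odot)$ constitutes 
a commutative idempotent semiring. Consider the attribute domain $A_\mathcal{S} = 
(D_\mathcal{S}, \cup, \odot, \odot, \cup, \odot, \cup)$ and the basic assignment 

\[
\beta_\mathcal{S}(b) =
\begin{cases}
\{(\{b\}, \emptyset)\} & \text{if } b \in \mathbb{B}^p, \\
\{(\emptyset, \{b\})\} & \text{otherwise.}
\end{cases}
\]

Clearly, $\mathcal{S}(T) = \mathcal{S}_B(T, \beta_\mathcal{S})$. By the previous observations $\mathcal{S}_B(T, \beta_\mathcal{S})$ can be 
represented as 

\[
\begin{aligned}
\mathcal{S}_B(T, \beta_\mathcal{S}) ={}& \big(\beta_\mathcal{S}(b^1_1) \odot \beta_\mathcal{S}(b^1_2) \odot \dots \odot \beta_\mathcal{S}(b^1_{k_1})\big) \\
&\cup \big(\beta_\mathcal{S}(b^2_1) \odot \beta_\mathcal{S}(b^2_2) \odot \dots \odot \beta_\mathcal{S}(b^2_{k_2})\big) \cup \cdots \\
&\cup \big(\beta_\mathcal{S}(b^n_1) \odot \beta_\mathcal{S}(b^n_2) \odot \dots \odot \beta_\mathcal{S}(b^n_{k_n})\big).
\end{aligned}
\tag{4}
\]

We chose the representations (3) and (4) in such a way that for $i \in \{1, \dots, n\}$ 
and $j \in \{1, \dots, k_i\}$ the basic action $b^i_j$ in (3) is the same as $b^i_j$ in (4), which is 
possible due to the commutativity of the operations. 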

From the definitions of the basic assignment $\beta_\mathcal{S}$ and the operation $\odot$ it follows 
that for every $i \in \{1, \dots, n\}$ the $i$th term 

\[ \beta_\mathcal{S}(b^i_1) \odot \beta_\mathcal{S}(b^i_2) \odot \dots \odot \beta_\mathcal{S}(b^i_{k_i}) \]

of representation (4) is a set consisting of exactly one pair of sets. Let us 
denote this term with $\{(P_i, O_i)\}$. Observe that since $\mathcal{S}(T) = \mathcal{S}_B(T, \beta_\mathcal{S})$, we 
have $(P_i, O_i) \in \mathcal{S}(T)$ for every $i$, and, conversely, for every $(P, O) \in \mathcal{S}(T)$ there 
exists at least one $i$ such that $(P, O) = (P_i, O_i)$. 

Finally, we denote the $i$th term of representation (3) with $a_i$. Now we are 
ready to prove Theorem 1. 


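Proof of Theorem 1. If there are no repeated labels in $T$ or the operator $\otimes$ is 
idempotent, then for $i \in \{1, \dots, n\}$ it holds that $a_i = \bigotimes_{b \in P_i \cup O_i} \beta_\alpha(b)$. Together 
with the idempotency of $\oplus$ this implies that 

\[
\alpha_B(T, \beta_\alpha) = \bigoplus_{i=1}^{n} a_i = \bigoplus_{(P,O) \in \mathcal{S}(T)} \ \bigotimes_{b \in P \cup O} \beta_\alpha(b).
\]

By Remark 1, the right-hand side is equal to $\alpha_S(T, \beta_\alpha)$. 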


We finish this section by providing the proof of Theorem 2. 


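Proof of Theorem 2. Consider a result $r^C$ of the bottom-up procedure obtained 
in line 8 of Algorithm 1 for a set $C \subseteq C_O$ of optional clones. Using representation (3), 
it can be written as 

\[
\begin{aligned}
r^C ={}& \alpha_B(T, \beta'_\alpha) \otimes \bigotimes_{b \in C_O \setminus C} \beta_\alpha(b) \\
={}& \big(\beta'_\alpha(b^1_1) \otimes \beta'_\alpha(b^1_2) \otimes \dots \otimes \beta'_\alpha(b^1_{k_1})\big) \otimes \bigotimes_{b \in C_O \setminus C} \beta_\alpha(b) \\
&\oplus \big(\beta'_\alpha(b^2_1) \otimes \beta'_\alpha(b^2_2) \otimes \dots \otimes \beta'_\alpha(b^2_{k_2})\big) \otimes \bigotimes_{b \in C_O \setminus C} \beta_\alpha(b) \oplus \cdots \\
&\oplus \big(\beta'_\alpha(b^n_1) \otimes \beta'_\alpha(b^n_2) \otimes \dots \otimes \beta'_\alpha(b^n_{k_n})\big) \otimes \bigotimes_{b \in C_O \setminus C} \beta_\alpha(b).
\end{aligned}
\]

Let us denote the $i$th term of the above expression with $r^C_i$. Observe that the 
result of Algorithm 1 is 

\[
\alpha_A(T, \beta_\alpha) = \Big( \bigoplus_{C \subseteq C_O} r^C \Big) \otimes \bigotimes_{b \in C_N} \beta_\alpha(b)
= \bigoplus_{i=1}^{n} \Big[ \Big( \bigoplus_{C \subseteq C_O} r^C_i \Big) \otimes \bigotimes_{b \in C_N} \beta_\alpha(b) \Big].
\]

Due to the values assigned to the optional clones in the for loop, the inner 
expression can be expanded as follows. 

\[
\begin{aligned}
\bigoplus_{C \subseteq C_O} r^C_i
={}& \bigoplus_{\substack{C \subseteq C_O \\ C \cap (P_i \cup O_i) \neq \emptyset}} \Big( a_\otimes \otimes \bigotimes_{b \in C_O \setminus C} \beta_\alpha(b) \Big) \\
&\oplus \bigoplus_{\substack{C \subseteq C_O \\ C \cap (P_i \cup O_i) = \emptyset}} \Big( \bigotimes_{\substack{b \in P_i \cup O_i \\ b \notin C_N \cup C_O}} \beta_\alpha(b) \otimes \bigotimes_{\substack{b \in P_i \cup O_i \\ b \in C_N \cup C_O \setminus C}} e_\otimes \otimes \bigotimes_{b \in C_O \setminus C} \beta_\alpha(b) \Big) \\
={}& \bigoplus_{\substack{C \subseteq C_O \\ C \cap (P_i \cup O_i) = \emptyset}} \Big( \bigotimes_{\substack{b \in P_i \cup O_i \\ b \notin C_N}} \beta_\alpha(b) \otimes \bigotimes_{\substack{b \notin P_i \cup O_i \\ b \in C_O \setminus C}} \beta_\alpha(b) \Big).
\end{aligned}
\]

Since the attribute domain is non-increasing, the last “sum” is absorbed by the 
term corresponding to the set $C$ satisfying $C_O \setminus C = (P_i \cup O_i) \cap C_O$, namely, the 
term $\bigotimes_{b \in P_i \cup O_i,\ b \notin C_N} \beta_\alpha(b)$. Thus, 

\[
\begin{aligned}
\alpha_A(T, \beta_\alpha) &= \bigoplus_{i=1}^{n} \Big[ \bigotimes_{\substack{b \in P_i \cup O_i \\ b \notin C_N}} \beta_\alpha(b) \otimes \bigotimes_{b \in C_N} \beta_\alpha(b) \Big] \\
&= \bigoplus_{i=1}^{n} \ \bigotimes_{b \in P_i \cup O_i} \beta_\alpha(b) = \bigoplus_{(P,O) \in \mathcal{S}(T)} \ \bigotimes_{b \in P \cup O} \beta_\alpha(b),
\end{aligned}
\]

where the second equality follows from the definition of necessary clones and the fact 
that $\beta_\alpha|_{\mathbb{B}^o} \equiv a_\otimes$, and the last one holds by the idempotency of $\oplus$. The proof is 
complete. 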


6 Conclusion 


The goal of the work presented in this paper was to tackle the issue of quantitative 
analysis of attack–defense trees in which a basic action can appear multiple 
times. We have presented conditions ensuring that in this setting the classical, 
fast bottom-up procedure for attribute evaluation yields a valid result. For a 
subclass of attributes, we have identified a necessary and sufficient condition for 
compatibility of the bottom-up evaluation with the set semantics. We have also 
presented a constructive evaluation method that takes the presence of repeated 
labels into account and applies to a wide and important subclass of attributes. 

This work addresses only the tip of the iceberg of a much larger problem 
which is the analysis and quantification of attack—defense trees with dependent 
actions. The notion of clones captures the strongest type of dependency between 
goals, namely where the nodes bearing the same label represent exactly the same 
instance of the same goal. It is thus obvious that the attribute values for the 
clones should only be considered once in the attribute computations. However, in 
practice, weaker dependencies between goals may also be present. For instance, 
when the attacker has access to a computer with sufficient computation power, 
the attack consisting in guessing a password becomes de facto the brute force 
attack and can be performed within a reasonable time, for most of the passwords 
used in practice. In contrast, if this attack is performed manually, it will, most 
probably, take much longer to succeed. Similarly, if the attacker knows the vic- 
tim, guessing their password manually will, in most cases, be faster compared 
to the situation when the attacker is a stranger to the victim. Of course, this 
problem can be solved by relabeling the nodes and using differently named goals 
for the two situations. However, this solution is not in line with the practical 
usage of attack(–defense) trees whose construction often relies on preexisting 
libraries of attack patterns where the nodes are already labeled and the labels 
are as simple as possible. We are currently working on improving the standard 
bottom-up evaluation procedure for attributes (in the spirit of Algorithm 1) to 
accommodate such weakly dependent nodes. 

Furthermore, it would be interesting to try to generalize Algorithm 1 for the 
approaches proposed in the past for the restricted class of attack—defense trees 
without repeated labels. Such approaches include, for instance, multi-objective 
optimization defined in [2] and a method for selecting the most suitable set of 
countermeasures, based on integer linear programming, developed in [21]. 


Acknowledgments. We would like to thank Angèle Bossuat for fruitful discussions on 
the interpretation of repeated labels in attack-defense trees and on possible approaches 
to the problem of quantification in the presence of clones. 